Catalyzing collaborations: Prescribed interactions at conferences determine team formation

Collaboration plays a key role in knowledge production. Here, we show that patterns of interaction during conferences can be used to predict who will subsequently form a new collaboration, even when interaction is prescribed rather than freely chosen. We introduce a novel longitudinal dataset tracking patterns of interaction among hundreds of scientists during multi-day conferences encompassing different scientific fields over the span of 5 years. We find that participants who formed new collaborations interacted 63% more on average than those who chose not to form new teams, and that those assigned to a higher interaction scenario had more than an eightfold increase in their odds of collaborating. We propose a simple mathematical framework for the process of team formation that incorporates this observation as well as the effect of memory beyond interaction time. The model accurately reproduces the collaborations formed across all conferences and outperforms seven other candidate models. This work suggests not only that encounters between individuals at conferences play an important role in shaping the future of science, but also that these encounters can be designed to better catalyze collaborations.

The scientific enterprise has increasingly become a team effort [1][2][3][4]. Forming new and interdisciplinary scientific collaborations^1 is crucial to spurring the innovation necessary to address many current and future challenges facing society [6][7][8]. However, there are intellectual, technical, and logistical obstacles that impede the formation of new teams [9]. In particular, research has shown that geographical proximity is a factor in team assembly [10].
^1 In this paper, we will use the terms "collaboration," "team formation," and "team assembly" interchangeably, as these terminologies can all be found in the literature designating the same phenomenon [5].
Conferences can help overcome these barriers and are one of the main catalysts for the formation of new scientific collaborations. However, convening conferences is expensive in terms of organizational, travel, environmental, and opportunity costs; the direct monetary cost for academic meetings alone is estimated at tens of billions of US dollars each year [11,12]. Given the extensive costs associated with in-person meetings along with benefits virtual conferences may offer in terms of equity and inclusion [13], the COVID-19 pandemic has prompted discussions about how scientific conferences should be held even after it is safe to convene them in person [14]. Some posit that they should continue to be held virtually rather than in person [12], others that hybrid features should be included [15,16], and yet others that the lack of in-person interaction puts a significant damper on scientific productivity and innovation [17].
Past research has mostly focused on measuring various aspects of the success of scientific collaborations (understood as co-authorship on publications) once formed [18,19] and the makeup of successful teams (including, e.g., metrics such as the number of institutions present [20], team size [21], and team "freshness" [22]). There have been some efforts to study scientific team assembly (see, e.g., [3,9,23,24]), but little is yet known about the impact of conferences on collaboration initiation. Some limited evidence, however, suggests that increased interaction among potential members raises the likelihood of team formation [25][26][27].
Here we present evidence that properly engineered interaction leads to collaboration, and we go beyond empirical observation by proposing a mathematical model for the origin of this phenomenon. Such a model has the potential to allow for optimization of conference design to promote collaboration. The model takes as input the pairwise levels of interaction among conference participants as well as their pre-conference familiarity with one another, and estimates the probability that any pair of participants will subsequently form a collaboration. We test this model using data collected by the Research Corporation for Science Advancement (RCSA) during a series of "Scialog" conferences that they organized.

NOVEL "SCIALOG" DATASET
We construct a novel longitudinal dataset derived from a diverse set of conferences known as "Scialogs" [28]. Organized by the nonprofit funding agency Research Corporation for Science Advancement (RCSA), these conferences seek to accelerate the work of science through research, intensive dialog, community building, and by catalyzing new scientific collaborations on challenges of global significance. Scialog conferences last three days and have an interactive format, with the participation of around 50 fellows, who are invited early-career scientists, and around 10 facilitators, who are more senior scientists.
For each conference, we have detailed records of how well each participant knew the other participants before the conference, which sessions they attended, with whom they wrote proposals at the end of the conference, and which proposals were funded. Participants are assigned to larger topical discussion sessions of 8-10 fellows facilitated by 1-2 established scientists as well as small group sessions of 3-4 people. Prior knowledge between participants is measured on a four-point scale in a preconference survey (see supporting information (SI) for details). At the end of each conference, the participants may self-assemble into teams of 2-4 members to write proposals. A total of 20-35 proposals are typically submitted at each event, of which 5-8 are ultimately funded.
Scialog conferences are ongoing, so the dataset is still expanding, in particular to include virtual conferences instituted during the pandemic. In its current state, it comprises 12 conferences held from 2015 to 2020, divided into four multi-conference series. Each series of conferences deals with a particular scientific topic, and most participants return from one conference to the next. For simplicity, in this paper we have used only data corresponding to the first year of each series, so that the model is tested without the effects of participants returning in subsequent years. Thus, we used data from four Scialog conferences [28] from 2015 to 2018, corresponding to 254 total participants and 4,897 potential pairs of scientists who could form a collaboration^2. For the purpose of this paper, pairs are treated as the fundamental unit of collaboration, and a pair of participants is considered to have formed a collaboration if they were part of the same proposal submission team (triads and tetrads are treated as sets of collaborating pairs). Table I summarizes descriptive statistics for each of the four conferences. Future reports will focus on how the likelihood of collaboration changes across the multi-year arc of each initiative as the individuals get to know one another better, on longer-term collaborations measured by co-authorship, and on effects of virtual versus in-person conferences.
^2 Some pairings were prohibited because of prior collaborations.

EMPIRICAL RESULTS

Interacting More Leads to Collaboration
We first tested whether pairs who collaborated were different from pairs who did not in terms of total effective interaction $I_{\mathrm{tot}}$ (see appendix Defining Interaction). To do so, we employed the Mann-Whitney U test [29]; metrics were computed for each conference individually as well as for aggregations of the conferences. All metrics indicate that collaborators had significantly more interaction than non-collaborators (see left panel of Fig. 1). The right panel of Fig. 1 shows that collaborators spent 63% more total effective time together than non-collaborators on average across all conferences. This is equivalent to being in a group of 12 people for an extra 45 minutes (60% of the duration of a topical discussion session) or being in a group of 4 people for an extra 15 minutes (50% of the duration of a small group session).
We also tested whether there was a difference in prior knowledge ($K_0$) for pairs who collaborated versus those who did not (again using the Mann-Whitney U test). The difference was statistically significant at Conference B ($p = 6.6 \times 10^{-5}$) and Conference C ($p = 0.0049$), significant only at the 10% level at Conference A ($p = 0.063$), and not significant at Conference D ($p = 0.86$).^3
^3 Interestingly, we found no significant effect of interaction or prior knowledge on whether or not a proposal was funded.
To disentangle causality from correlation for the effect of interaction on collaboration initiation, we performed a test based on counter-factual schedules for one of the conferences. To assign participants to sessions, the conference organizers used a simulated annealing algorithm that attempted to optimize placement based on participant characteristics and information from a preconference survey^4. Though these assignments were not random, there were numerous options with equivalent or nearly equivalent scores, leaving the organizers to choose among them based on other criteria. We selected the top 50 alternative solutions for (larger topical) discussion session assignments and the top 50 alternative solutions for small group session assignments, which combined led to 2,500 counter-factual conference schedules. Note that any one of these schedules could have been chosen as the true conference schedule.
^4 The goal was to generate mixtures of participants with varied research methods, disciplines, genders, and minimal professional connections, while also avoiding repeated assignments to groups with similar members in different sessions (thus increasing the number of individuals each participant interacted with).
For each counter-factual schedule, we computed the mean interaction^5 for all pairs that, in the actual conference, ended up collaborating; we denote this as $I^{i}_{\mathrm{CF}}$ (where the integer $i \in [1, 2500]$ indexes the particular counter-factual schedule). It is of interest to compare this quantity to the mean interaction according to the actual conference schedule, which we denote $I_A$. If interacting more had no causal impact on collaboration, we would expect to see little difference in these numbers, with $I^{i}_{\mathrm{CF}} > I_A$ for some $i$ (i.e., under some counter-factual schedules) and $I^{i}_{\mathrm{CF}} < I_A$ for others. Instead, what we found was that $I_A$ was nearly always much greater: this was true for more than 99% of the counter-factual schedules. The only cases where $I^{i}_{\mathrm{CF}} > I_A$ was observed corresponded to counter-factual scenarios sharing the same exact small-group session assignments but with variations in the larger topical discussion session assignments. Note that this method enabled us to blindly recover the small group assignments knowing only which pairs ultimately collaborated, which strongly suggests a causal connection between intense interaction in a small-group setting and team formation. See Fig. 2 for a graphical display of this result.
^5 Here, for brevity and clarity, we use the word "interaction" to mean total effective interaction over the course of the conference.
To quantify the statistical significance of this result, we performed a Wilcoxon signed-rank test [30]. The null hypothesis that the distribution of $I_A - I^{i}_{\mathrm{CF}}$ has zero median is rejected at the $10^{-5}$ level of significance.
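The bookkeeping behind this comparison can be sketched in a few lines. The snippet below assumes that a co-attended session of duration T with N participants contributes 2T/N to a pair's total effective interaction (the normalization described in the appendix); the schedule layout, the function names, and the toy data are hypothetical and serve only as an illustration, not as the analysis code used for the paper.

```python
def pair_interaction(schedule, a, b):
    """Total effective interaction of the pair (a, b) under one schedule.

    `schedule` is a list of sessions, each given as (duration_minutes, groups),
    where `groups` is a list of sets of participant ids. A co-attended group of
    N people lasting T minutes contributes 2*T/N (assumed normalization).
    """
    total = 0.0
    for duration, groups in schedule:
        for group in groups:
            if a in group and b in group:
                total += 2.0 * duration / len(group)
    return total


def mean_interaction(schedule, pairs):
    """Mean total effective interaction over a list of pairs."""
    return sum(pair_interaction(schedule, a, b) for a, b in pairs) / len(pairs)


def fraction_actual_higher(actual, counterfactuals, collaborating_pairs):
    """Share of counter-factual schedules under which the collaborating pairs
    would have interacted strictly less than under the actual schedule."""
    i_actual = mean_interaction(actual, collaborating_pairs)
    higher = sum(
        i_actual > mean_interaction(cf, collaborating_pairs)
        for cf in counterfactuals
    )
    return higher / len(counterfactuals)


# Toy data (hypothetical): one 30-minute session split into two groups of four.
actual = [(30, [{"a", "b", "c", "d"}, {"e", "f", "g", "h"}])]
counterfactuals = [
    [(30, [{"a", "e", "c", "g"}, {"b", "f", "d", "h"}])],
    [(30, [{"a", "b", "e", "f"}, {"c", "d", "g", "h"}])],
]
collaborating_pairs = [("a", "b"), ("c", "d")]
share = fraction_actual_higher(actual, counterfactuals, collaborating_pairs)
```

With this normalization, 45 minutes in a group of 12 and 15 minutes in a group of 4 give the same contribution, which matches the equivalence quoted in the text.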

Effect Size
In addition to showing that interaction has a statistically significant effect on collaboration probability, we also wish to know the size of the effect. To evaluate that, we restricted our data to pairs with initial knowledge $K_0 = 0$ ($N = 984$) and used bootstrap statistics to estimate the odds of collaboration for pairs who co-attended one mini-session (0.15, 95% CI [0.10, 0.21]) and for those who did not co-attend any mini-session but could have in one of the 2,500 counter-factual scenarios (0.017, 95% CI [0.0085, 0.028]). In this case, co-attending a mini-session multiplied the odds of a pair collaborating by 8.7.
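As a rough illustration of how such odds and their intervals can be obtained, the sketch below applies a percentile bootstrap to binary collaboration outcomes. The resampling scheme, the function names, and the two synthetic outcome lists are assumptions made for the example; the counts are invented and are not the conference data.

```python
import random


def odds(outcomes):
    """Odds p / (1 - p) of collaborating, from a list of 0/1 outcomes."""
    p = sum(outcomes) / len(outcomes)
    return p / (1.0 - p)


def bootstrap_odds_ci(outcomes, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap interval for the odds."""
    rng = random.Random(seed)
    n = len(outcomes)
    stats = []
    for _ in range(n_boot):
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        p = sum(resample) / n
        if p < 1.0:  # odds are undefined if every resampled pair collaborated
            stats.append(p / (1.0 - p))
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return odds(outcomes), lower, upper


# Synthetic outcomes (1 = pair collaborated), for illustration only.
co_attended = [1] * 13 + [0] * 87       # pairs that shared a mini-session
not_co_attended = [1] * 15 + [0] * 885  # pairs that could have, but did not

est_co, lo_co, hi_co = bootstrap_odds_ci(co_attended)
est_not, lo_not, hi_not = bootstrap_odds_ci(not_co_attended)
odds_ratio = est_co / est_not
```

The ratio of the two point estimates plays the role of the multiplicative effect reported above; an interval for the ratio itself could be obtained by resampling both groups within the same bootstrap loop.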

MODEL FOR THE DYNAMICS OF TEAM FORMATION AT CONFERENCES
Beyond empirical observations, we develop a mathematical model for the dynamics of team formation at conferences. In 1996, the physicist Serge Galam asked: "Do humans behave like atoms?" [31]. The mathematical model we present is based on the idea that scientists at a conference behave like molecules in a solution, where formation of a collaboration is analogous to undergoing a chemical reaction. The conference itself acts as a catalyst by lowering the barriers to collaboration and creating more "productive collisions" among the scientists who participate.

Linear Model
As a first simple model, consider a pair of attendees at a conference. We assume that the collaboration probability $P(t)$ rises for nonzero interaction intensity $I$ between participants and that, when interaction ceases, the probability of a collaboration forming decays. For simplicity we assume linear growth and decay processes, leading to the following ordinary differential equation (ODE) governing the change in collaboration probability over time:

$$\frac{dP}{dt} = S\,\frac{I}{I_{\max}}\,(1 - P) - W\left(1 - \frac{I}{I_{\max}}\right)P. \tag{1}$$

This linear model has been constructed to reflect the above assumptions and allow for exponential approach to $P = 1$ at "strengthening" rate $S$ when $I = I_{\max}$ (with $I_{\max}$ the maximum possible interaction intensity), and exponential relaxation to $P = 0$ at "weakening" rate $W$ when $I = 0$. $W$ and $S$ are assumed constant. The model can easily be generalized to incorporate more realistic bounds on the collaboration probability:

$$\frac{dP}{dt} = S\,\frac{I}{I_{\max}}\,(P_{\max} - P) - W\left(1 - \frac{I}{I_{\max}}\right)(P - P_{\min}). \tag{2}$$

Here $P_{\min} > 0$ would allow for two participants to form a collaboration even though they have not interacted (for example by being brought together by a third party), and $P_{\max} < 1$ allows for non-collaboration even if two participants interact maximally. Two key parameters control the dynamics of this model: the strengthening:weakening ratio $S/W$ and the relative intensity of interaction $I/I_{\max}$.
Eq. 2 can be solved exactly for simple functions $I(t)$. For example, for constant $I$, $P(t)$ relaxes exponentially to its equilibrium value $P^{*}$. To understand the changes in collaboration probabilities that may occur during the course of a conference, we focus on the case where $I = I(t)$ is not constant, representing the time-varying strength of interaction between two individuals. Fig. 3(a) shows a realistic-looking example for a 2-day conference, with three sessions of different lengths and intensities, one small (4 people) and two larger (12 people) in this example. Note that $I(t)$ is a dimensionless quantity; for more details on how $I(t)$ was constructed from data, see appendix Defining Interaction below.
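As a worked illustration of the constant-$I$ case, the lines below assume that Eq. 2 has the linear form written on the first line, which is our reading of the stated growth and decay assumptions. The shorthand rates $\alpha$ and $\beta$ are introduced here only for convenience.

```latex
\begin{align}
\frac{dP}{dt} &= \alpha\,(P_{\max} - P) - \beta\,(P - P_{\min}),
\qquad \alpha = S\,\frac{I}{I_{\max}},\quad
\beta = W\left(1 - \frac{I}{I_{\max}}\right), \\
P^{*} &= \frac{\alpha P_{\max} + \beta P_{\min}}{\alpha + \beta}, \\
P(t) &= P^{*} + \bigl(P(0) - P^{*}\bigr)\,e^{-(\alpha + \beta)\,t}.
\end{align}
```

Under this reading, $P^{*}$ is a weighted average of $P_{\max}$ and $P_{\min}$: it equals $P_{\min}$ at $I = 0$ and $P_{\max}$ at $I = I_{\max}$, while the relaxation rate $\alpha + \beta$ interpolates between $W$ and $S$.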
We can express the right-hand side of Eq. 2 as the derivative of a potential function $V(P)$:

$$\frac{dP}{dt} = -\frac{dV}{dP},$$

where the minus sign implies that stable equilibria occur at potential minima. Since Eq. 2 is linear, the resulting potential is quadratic, with a local minimum somewhere in $0 \le P \le 1$ (depending on $I$).
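For concreteness, one potential consistent with this description can be written out explicitly. The expression below assumes the linear right-hand side $\alpha\,(P_{\max} - P) - \beta\,(P - P_{\min})$, with the shorthand $\alpha = S\,I/I_{\max}$ and $\beta = W\,(1 - I/I_{\max})$ (notation introduced for this sketch), and $P^{*} = (\alpha P_{\max} + \beta P_{\min})/(\alpha + \beta)$.

```latex
\begin{align}
V(P) &= \tfrac{1}{2}\,(\alpha + \beta)\,\bigl(P - P^{*}\bigr)^{2} + \text{const}, \\
-\frac{dV}{dP} &= -(\alpha + \beta)\,\bigl(P - P^{*}\bigr)
  = \alpha\,(P_{\max} - P) - \beta\,(P - P_{\min}).
\end{align}
```

The single minimum sits at $P^{*}$, which lies between $P_{\min}$ and $P_{\max}$ and shifts toward $P_{\max}$ as the interaction intensity grows.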

Nonlinear Catalysis Model
We expect the linear model described above to capture some of the dynamics of formation of new scientific collaborations during conferences. One major limitation, however, is that the linear model relaxes to $P_{\min}$ after interaction has ceased, which implies that participants completely forget one another. For a more realistic generalization, we wish to allow scientists who have interacted sufficiently to remember one another long after the interaction has ceased. To implement this, we modify the potential landscape for a new nonlinear model as shown in Fig. 3(d).
When interaction $I = 0$, there are two stable equilibria, one at the minimum probability $P_{\min}$ and the other at the memory state $P_{\mathrm{mem}}$, with an unstable equilibrium in between. As the interaction increases, it acts as a catalyst by changing the shape of the potential function and reducing the barrier between the two stable states. At a critical value of the interaction, $I_c$, a bifurcation occurs and the barrier disappears, leaving only a single stable equilibrium. If the system gets sufficiently close to that new equilibrium before interaction ceases, the probability will remain permanently in the higher memory state $P_{\mathrm{mem}}$.
The exact form of the potential function we employ (a piecewise quadratic function) was motivated by setting rates of strengthening and weakening to be consistent with the original linear model, so curvature of the potential function is higher to the left of each fixed point (strengthening) and lower to the right (weakening). Table II summarizes the variables and parameters in the model. See SI for exact form.
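The memory effect can be illustrated with a toy bistable system. The sketch below does not use the piecewise quadratic potential of the SI; it swaps in a hypothetical cubic relaxation term with equilibria at `p_min`, `p_barrier`, and `p_mem`, plus an additive drive equal to the interaction intensity. All parameter values and function names are assumptions chosen only to show the qualitative behavior.

```python
def force(p, i, k=10.0, p_min=0.0, p_barrier=0.3, p_mem=0.8):
    """Hypothetical dP/dt: cubic bistable relaxation plus a drive equal to I.

    With i = 0 the stable equilibria are p_min and p_mem and the unstable one
    is p_barrier. For these defaults the lower equilibrium disappears once i
    exceeds roughly 0.15, which plays the role of the critical interaction.
    """
    return -k * (p - p_min) * (p - p_barrier) * (p - p_mem) + i


def simulate(pulse_intensity, pulse_length=2.0, t_end=20.0, dt=0.01, p0=0.0):
    """Integrate with RK4; I equals pulse_intensity for t < pulse_length, else 0.

    The interaction is held constant within each time step.
    """
    p = p0
    for n in range(int(round(t_end / dt))):
        i = pulse_intensity if n * dt < pulse_length else 0.0
        k1 = force(p, i)
        k2 = force(p + 0.5 * dt * k1, i)
        k3 = force(p + 0.5 * dt * k2, i)
        k4 = force(p + dt * k3, i)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


weak = simulate(0.1)    # sub-critical pulse: probability falls back to p_min
strong = simulate(0.5)  # super-critical pulse: probability settles at p_mem
```

In this toy setting a pulse below the critical intensity leaves no lasting trace, whereas a pulse above it carries the pair over the barrier, after which the probability stays at the memory state even though interaction has stopped.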

Parameter Fitting and Model Selection
We validate the nonlinear catalysis model by testing how well it explains which pairs of participants ultimately collaborated. The probability of collaborating is the output of the model at time $t = T_{\mathrm{Collab}}$ (see Fig. 3), the start of the period allocated for team formation and proposal writing at the end of the conference.
The model was solved numerically with a fourth-order Runge-Kutta method, assuming the initial condition $P = P_{\min}$ for all pairs. We found optimal parameter values for each conference by minimizing the negative log-likelihood of the model using a constrained Nelder-Mead simplex algorithm [32] with a grid of initial values. Fig. 4 shows agreement between model predictions and data for how collaborations increase with interaction. Of the 85 data points, 84 are contained within the 95% confidence interval predicted by the model.
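The two ingredients of this procedure, numerical integration and a Bernoulli likelihood, can be sketched as follows. For brevity the example integrates the linear model rather than the nonlinear catalysis model, and it assumes the right-hand side $S\,(I/I_{\max})(P_{\max} - P) - W\,(1 - I/I_{\max})(P - P_{\min})$; the function names and parameter values are hypothetical.

```python
import math


def linear_rhs(p, i, s, w, i_max=1.0, p_min=0.0, p_max=1.0):
    """Assumed right-hand side of the linear model for dP/dt."""
    x = i / i_max
    return s * x * (p_max - p) - w * (1.0 - x) * (p - p_min)


def rk4_solve(intensity, t_end, dt, p0, s, w):
    """Classical fourth-order Runge-Kutta; `intensity` maps time t to I(t)."""
    p, t = p0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = linear_rhs(p, intensity(t), s, w)
        k2 = linear_rhs(p + 0.5 * dt * k1, intensity(t + 0.5 * dt), s, w)
        k3 = linear_rhs(p + 0.5 * dt * k2, intensity(t + 0.5 * dt), s, w)
        k4 = linear_rhs(p + dt * k3, intensity(t + dt), s, w)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return p


def negative_log_likelihood(probabilities, collaborated, eps=1e-12):
    """Bernoulli negative log-likelihood of observed 0/1 collaboration outcomes."""
    nll = 0.0
    for p, y in zip(probabilities, collaborated):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        nll -= math.log(p) if y else math.log(1.0 - p)
    return nll


# Check against the closed-form solution for constant interaction.
s, w, i_const, t_end = 2.0, 1.0, 0.5, 3.0
alpha, beta = s * i_const, w * (1.0 - i_const)
p_star = alpha / (alpha + beta)
exact = p_star * (1.0 - math.exp(-(alpha + beta) * t_end))
numeric = rk4_solve(lambda t: i_const, t_end, 0.01, 0.0, s, w)
```

In a full fit, the probabilities returned by the solver for every pair would be passed to `negative_log_likelihood`, and that scalar would be handed to a simplex optimizer (for instance `scipy.optimize.minimize` with `method="Nelder-Mead"`) started from several initial guesses.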
We compared the quality of the nonlinear catalysis model to several null models by computing the Akaike Information Criterion (AIC) [33] and relative likelihood for each one. These included: (1) a random probability of collaborating between 0 and 1 for each pair, (2) a constant probability of collaborating between 0 and 1 for all pairs, (3) a linear function of prior knowledge $K_0$, (4) a linear function of total effective interaction time $I_{\mathrm{tot}}$, (5) a linear function of $K_0$ and $I_{\mathrm{tot}}$, (6) a threshold model where the probability was $P_{\mathrm{mem}}$ if the interaction ever exceeded a critical value ($I > I_c$) and $P_{\min}$ if not, and (7) the linear model described earlier. Note that candidate models (3)-(7) can be viewed as approximate limits of the nonlinear catalysis model for certain parameter values. Table III shows that the AIC of the nonlinear catalysis model is lower than that of the next-best model for conferences A, B, C, and D, indicating that it is the preferred model for all four conferences.
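For reference, the two quantities used in this comparison are simple to compute once each model's maximized log-likelihood is known. The sketch below uses the standard definitions AIC = 2k - 2 ln L and relative likelihood exp((AIC_min - AIC_i)/2); the model names, parameter counts, and log-likelihood values are invented for illustration.

```python
import math


def aic(n_params, log_likelihood):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def relative_likelihoods(aic_by_model):
    """exp((AIC_min - AIC_i) / 2) for each model; the best model scores 1."""
    best = min(aic_by_model.values())
    return {
        name: math.exp((best - value) / 2.0)
        for name, value in aic_by_model.items()
    }


# Hypothetical fits: (number of parameters, maximized log-likelihood).
fits = {
    "constant": (1, -120.0),
    "linear_in_K0": (2, -112.0),
    "catalysis": (5, -100.0),
}
aic_by_model = {name: aic(k, ll) for name, (k, ll) in fits.items()}
rel = relative_likelihoods(aic_by_model)
preferred = min(aic_by_model, key=aic_by_model.get)
```

The penalty term 2k means a model with more parameters is preferred only if its log-likelihood improves by more than one unit per added parameter.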

DISCUSSION AND LIMITATIONS
Our analysis is predicated on the quality of interpersonal interactions as well as their quantity. Impromptu meetings around the coffee maker may differ from those at a conference where participants were encouraged to have a specific conversation and incentivized to form teams through a grant-awarding process, as was the case in the dataset analyzed here. This may limit the contexts in which our model can be applied.
Though we have only tested our model explicitly on the short time scale of one conference, it seems likely that similar dynamics also play out on a multiyear time scale, corresponding to the typical time from project inception to scientific publication. Future work will address this question explicitly^6.
Our model is probably most applicable when conference participants have limited initial familiarity. Table I shows that the mean prior knowledge among pairs of participants at conference B was almost five times that of conference D (where participants knew each other the least). That is consistent with the observation that in-person interactions were less significant drivers of collaboration formation at conference B. In a similar vein, Table III shows that the relative likelihood of the next-best model is highest for conference B, and that next-best model is $aK_0 + b$, which accounts only for prior knowledge.
A limitation of this work is that we do not explicitly account for many issues likely affecting team assembly such as, e.g., personality characteristics, homophily, and distance between research fields. However, our approach implicitly incorporates these to some extent through its probabilistic nature. They could also be explicitly incorporated in a more complex future model, but we see the success of the nonlinear catalysis model as remarkable precisely because of its simplicity.
In our definition of $I_{\mathrm{tot}}$, we have made the simplifying assumption that interaction time is equal for all participants during a group session, despite varying speaking times. Although this approximation does not capture potentially important effects (see, e.g., [34][35][36]), it is a neutral choice absent explicit speaking-time data.
We have also made the simplifying assumption that pairwise dynamics are the primary drivers of collaboration, but more work is needed to quantify the impact of triads, tetrads, or other multilateral interactions. This may, however, be partially compensated for in our model through the incorporation of parameters $P_{\min}$ and $I_{\min}$.
The Scialog conferences are not representative of all types of conferences, as their goal is explicitly to generate new collaborations and they provide a financial incentive in the form of grants for participants to do so. An extension of this work could study how collaborations are generated at larger conferences or conferences among scientists in regions of conflict where the barrier to collaboration is higher [37]. Research is also needed to compare the benefits and drawbacks of in-person and virtual conferences, especially in terms of the formation of new collaborations.

CONCLUSIONS
We have found evidence that prescribed exposure (through structured group interaction) leads to team formation at scientific conferences, and that more interaction leads to a higher probability of forming a collaboration. The nonlinear catalysis model we developed performs better than any other model tested, suggesting that the memory effect it incorporates is key to understanding team formation. The nonlinear catalysis model is not necessarily limited to scientific conferences and collaboration; we speculate that it may also have applicable extensions in other areas where matches between individuals within a network are sought. For example, in business settings, employers may wish to promote organic team-building through prescribed sessions among employees. In romantic contexts, such a model could inform online dating algorithms and approaches to social interaction. In pedagogical settings, educators might use in-class prescribed group exercises to promote formation of student study groups or teams for collaborative assignments.
Scientific conferences play an important role in the diffusion of knowledge and can generate novel ideas. We have shown that properly engineered interaction at conferences induces the formation of new scientific collaborations. Our model helps to illuminate the mechanism by which this occurs, and we hope that it will play a role in designing future conferences such that benefits are maximized and the scientific enterprise is made to proceed as efficiently as possible.

Appendix: Defining Interaction
For each pair, total effective interaction $I_{\mathrm{tot}}$ is defined as the sum of effective interaction time over all sessions. Effective interaction during a co-attended session was taken to be proportional to the time one participant spent listening to the other, under the unrealistic but convenient assumption that all participants spoke equally. Thus, for a given pair of participants co-attending a session of duration $T_k$ with $N_k$ participants, we assumed:

$$I^{\mathrm{session}}_{k} \propto \frac{T_k\,(N_k - 1)/N_k}{N_k - 1} = \frac{T_k}{N_k},$$

where the numerator is the total time spent listening to others and the denominator is the number of people to listen to. When the pair are in different sessions, $I^{\mathrm{session}}_{k} = 0$. Normalizing so that $I^{\mathrm{session}}_{k} = T_k$ when $N_k = 2$,

$$I^{\mathrm{session}}_{k} = \frac{2\,T_k}{N_k}.$$

Then, with $k$ the index of the session and $m$ the number of sessions (here, $6 \le m \le 8$),

$$I_{\mathrm{tot}} = \sum_{k=1}^{m} I^{\mathrm{session}}_{k}.$$

The pairwise interaction intensity profile $I(t)$ was constructed in a similar fashion. To get instantaneous interaction intensity, we divided by the session length $T_k$ and chose units such that a 2-person session would have maximum intensity $I_{\max}$. We assumed a minimum interaction term (corresponding to informal interaction between sessions) $I_{\min} = 2/N_{\mathrm{tot}}$ that depends on the total number of participants $N_{\mathrm{tot}}$. We then added a term proportional to the initial knowledge $K_0$.^7 Eq. A.6 summarizes the interaction function as it was implemented:

$$I(t) = a K_0 + \begin{cases} 2/N_k & \text{in co-attended session } k \\ 0 & \text{in non co-attended sessions} \\ 2/N_{\mathrm{tot}} & \text{outside of session times} \end{cases} \tag{A.6}$$

Fig. 3(a) shows an example interaction function, with $T = 0$ corresponding to one hour before the start of the first topical discussion session (see SI for more example interaction functions).
^7 For simplicity in the model exposition, we considered $K_0 = 0$. When $K_0 > 0$, the model parameter $I_{\max}$ needs to be rescaled by the new maximum possible interaction, $6a + 1$ (from a hypothetical 2-person session for a pair with maximal $K_0 = 6$).

The authors acknowledge the United States Department of Agriculture NACA 58-3022-0-005 and the Research Corporation for Science Advancement for providing data and assistance. E.R.Z. thanks the National Science Foundation Graduate Research Fellowship Program DGE-184216 and the Northwestern Buffett Institute Global Impacts Graduate Research Fellowship for financial support and Maher Said for productive discussions.

We write the potential function for the nonlinear catalysis model as a piecewise function of $P$ dependent on the interaction intensity $I$. Although conceptually straightforward, its algebraic representation appears complicated because of its piecewise nature. Because of that, we first present two simplified special cases to illustrate its structure.

II. PRE-CONFERENCE SURVEY
Before attending the conference, participants and fellows were asked to complete the following survey. 100% of participants completed the pre-conference survey for conferences A, C and D and 98% for Conference B.

Prior Knowledge
For each name please choose one answer that best describes your relationship with that person prior to this Scialog meeting. There are four categories to choose from:
Unfamiliar: You are not aware of the research of the person.
Awareness: Choose this option if you are aware of the research of the person. Examples of "awareness" would be knowing the person's specific area of expertise or knowing details of a recent publication.
Discussion: Choose this option if you have had a substantive discussion about research with this person, through face-to-face conversation, email correspondence, or other means. Please do not select this choice if you have talked with this person and exchanged only basic information about the areas you work in. This level of relationship is meant to be higher than the previous level of "awareness" and presupposes awareness.
Collaborator: Choose this option if you have ever worked on a project or written a paper together, or formally collaborated with this person on or toward a tangible research output. Please do not select this choice if you have only technically "collaborated" but have never had a substantive research discussion with this person (e.g., coauthored a paper with 100 authors but never interacted). This level of relationship is meant to be higher than the previous level of "discussion" and presupposes awareness and discussion.
Names are listed alphabetically.
Surveys are customized to each respondent. Your name will not appear on the list.

Interest in Discussion Topics
Please choose your interest level for the proposed discussion topics below. Your input will be used to select the topics for discussion groups at the conference and help us choose which groups you'll be in. Click the "details" button to see more information.
These topics are based on suggestions made by Scialog Fellows, including you, in the conference registration form. Our hope is you will be able to indicate at least a few, and perhaps many, that you are "really into" or would "chime in." The order of topics is randomized.
Respondents are asked to rate each topic on a 5-point scale: No way - Might nap - Would listen - Would chime in - Really into it.

Nominating critical discussion participants
Listed below are the topics you expressed interest in. For each topic, if you think another Scialog Fellow is an essential person to have in a discussion on that topic, please indicate them below. You may select up to two for each topic but aren't required to select any. Click on the box and start typing or scroll to select a fellow.
The pre-conference survey results were incorporated into the interaction function as "prior knowledge" $K_0$ for each pair of fellows (A, B), where $K_0$ is the sum of prior knowledge reported by A about B and by B about A. Thus $K_0$ for each pair ranges from 0 to 6, where 0 represents both fellows being unfamiliar with each other and 6 represents both fellows reporting having previously collaborated.
The rules of the Scialog conferences do not allow participants who have previously collaborated (i.e., pairs with $K_0 \ge 5$) to be on the same proposal submission team. Therefore, when fitting the model to data, we eliminated pairs with $K_0 \ge 5$ (2.1% of pairs at Conference A, 11.7% of pairs at Conference B, 3.1% of pairs at Conference C, 1.5% of pairs at Conference D).
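The construction of $K_0$ and the exclusion rule can be sketched as below. The numeric scores 0 to 3 for the four survey categories are an assumption inferred from the stated 0 to 6 range of the pairwise sum; the data layout, function names, and respondent names are hypothetical.

```python
from itertools import combinations

# Assumed scoring of the four survey categories (inferred, not quoted).
SCORE = {"Unfamiliar": 0, "Awareness": 1, "Discussion": 2, "Collaborator": 3}


def prior_knowledge(responses, a, b):
    """K0 for the pair: what a reported about b plus what b reported about a."""
    return SCORE[responses[a][b]] + SCORE[responses[b][a]]


def eligible_pairs(responses, cutoff=5):
    """Pairs kept for model fitting, i.e. those with K0 below the cutoff."""
    people = sorted(responses)
    return [
        (a, b)
        for a, b in combinations(people, 2)
        if prior_knowledge(responses, a, b) < cutoff
    ]


# Hypothetical survey answers: responses[x][y] is the category x chose for y.
responses = {
    "ana": {"ben": "Collaborator", "cy": "Unfamiliar"},
    "ben": {"ana": "Collaborator", "cy": "Awareness"},
    "cy": {"ana": "Unfamiliar", "ben": "Discussion"},
}
kept = eligible_pairs(responses)
```

Because the two reports need not agree, summing them treats the relationship symmetrically while still registering one-sided familiarity.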

III. GROUP ASSIGNMENTS
The group assignments were determined prior to the conference with the goal of creating diverse groups for the topical and small group discussion sessions. For the topical discussion groups most of the Fellows who were placed in a group had rated their interest in the topic as a 4 or 5 (on a 1-5 scale with 5 indicating the most interest). Fellows were not placed in a group if they rated the topic under 3. These assignments were accomplished while maintaining diversity in the groups in terms of academic disciplines, research methodologies (e.g. theoretical vs experimental methods), and gender. For the small groups, nearly all Fellows in a group had no previous awareness of the others' research and none had previously engaged in scientific discussions with the other group members. Participants were mixed so that most small groups included Fellows with different disciplines and methodologies. A simulated annealing algorithm was used to provide candidate groupings based on these criteria. Group assignments for all topical sessions were optimized simultaneously to minimize the same Fellows having repeated assignments together in different sessions. Similarly, all the small group sessions were optimized simultaneously so that no Fellows were ever placed in a small group with another specific Fellow more than once. The algorithm typically returns several solutions with the same or similar energy levels, especially in the case of the small group sessions since they are less constrained. The organizers made the final selection of the group assignments from among the several best solutions.
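A minimal sketch of how simulated annealing can produce such groupings is given below. It is not the organizers' algorithm: the energy function here counts only within-group pairs who already know each other, whereas the actual objective also balanced disciplines, methods, gender, and topic interest. The cooling schedule, move set, function names, and toy data are all assumptions.

```python
import math
import random


def energy(groups, known_pairs):
    """Number of within-group pairs who already know each other (lower is better)."""
    return sum(
        1
        for group in groups
        for idx, a in enumerate(group)
        for b in group[idx + 1:]
        if frozenset((a, b)) in known_pairs
    )


def anneal(participants, group_size, known_pairs,
           steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Swap-based simulated annealing with geometric cooling (needs >= 2 groups)."""
    rng = random.Random(seed)
    people = list(participants)
    rng.shuffle(people)
    groups = [people[i:i + group_size] for i in range(0, len(people), group_size)]
    current = energy(groups, known_pairs)
    best, best_groups = current, [g[:] for g in groups]
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / steps)
        g1, g2 = rng.sample(range(len(groups)), 2)
        i, j = rng.randrange(len(groups[g1])), rng.randrange(len(groups[g2]))
        # Propose swapping one member between two different groups.
        groups[g1][i], groups[g2][j] = groups[g2][j], groups[g1][i]
        candidate = energy(groups, known_pairs)
        accept = candidate <= current or (
            rng.random() < math.exp((current - candidate) / temperature)
        )
        if accept:
            current = candidate
            if current < best:
                best, best_groups = current, [g[:] for g in groups]
        else:
            # Reject the move by undoing the swap.
            groups[g1][i], groups[g2][j] = groups[g2][j], groups[g1][i]
    return best_groups, best


# Toy example: 12 participants, groups of 4, four pre-existing acquaintances.
known = {frozenset(p) for p in [(0, 1), (2, 3), (4, 5), (6, 7)]}
groups, best = anneal(range(12), 4, known)
```

Running the routine with different seeds typically yields several distinct groupings of equal energy, which mirrors the situation described above in which the organizers choose among near-equivalent solutions.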
Participants who had previously collaborated were not allowed to submit a proposal together, and we therefore eliminated these pairs when fitting models to data. The median percentage of pairs omitted for this reason was 4.0%.