OPEN SOURCE SOFTWARE AND RESEARCH INFRASTRUCTURE
The UK features an impressive set of online systems and services meant to help researchers develop effective and innovative ways of conducting research. Nevertheless, researchers are still reluctant to use these technologies to their full potential. This document suggests that the main issues are social and organisational rather than technological, and maintains that some key lessons for improving the current situation can be learned from open source development practices.
To begin with, let's look at an overview of the conclusions drawn from recent studies and interviews with researchers and service providers in the academic sector. A more detailed analysis is provided in two related documents, which present open development lessons on community and on sustainability.
Research infrastructure background
In recent years, the ways in which academic research is carried out have been dramatically challenged by the opportunities provided by the internet. The internet has the potential to facilitate distributed collaboration between researchers, or e-Research, which in this context can be defined as research conducted in virtual communities across the academic and industrial sectors using specially designed online technical facilities and services. Members of these research communities have the ability to share, federate and exploit the collective power of global scientific facilities, supported by a technical framework that allows participation regardless of geographic location. This network of tools, resources and services that allow globally distributed scientists to collaborate on producing research outputs is known as research e-Infrastructure, or simply e-Infrastructure.
The research e-Infrastructure in the UK consists of a number of loosely connected or separate projects, tools and services. Some are general in scope and address the needs of various researchers. Others are quite specific and serve small groups of specialists. In both cases, e-Infrastructure is designed to foster new ways of conducting research and to improve cross-subject collaboration. This is similar to the role assigned to research infrastructure at a European level. In the European Commission's research strategy, for example, e-Infrastructure is perceived as a key element of the new European Research Area, and an important tool for global scientific cooperation. e-Infrastructure is intended to provide an innovation space where the specific interests of scientists are met and cross-subject solutions for distributed research are deployed.
Embedding human and technical infrastructures
The technical deployment of a complex e-Infrastructure is only one step towards fostering new ways of conducting research. An equally important step is encouraging researchers to use this technical framework to its full potential. Addressing this issue, a number of reports mention the need to embed the technical research framework in a 'human infrastructure'. In this context, 'human infrastructure' refers to the social and organisational arrangements enabling technologies to be used effectively. The AVROSS report, for example, states that e-Infrastructure uptake is as often hindered by human and organisational issues as it is by technical ones. Focusing on the UK, it recommends continued technical innovation in e-Research, but at the same time improving the social framework that would allow research communities to better exploit these technical assets.
Other reports highlight similar issues. For example, the e-Research Community Engagement findings mention the need to correct the current focus on building hi-tech systems and tools at the cost of grasping the real context of their use (recommendation 8). The study suggests creating research intermediaries, or 'boundary spanners', who can act as facilitators between research domains and between research and technical staff (recommendation 10). This is similar to the role of community in open source development, where the community is perceived as essential to the software that is being built.
The vital role that community development can play in the success of e-Research is beginning to be acknowledged on a global scale. For instance, the US National Science Foundation funds a Virtual Organizations as Sociotechnical Systems program. The European Commission has a strand on Virtual Research Communities in its current FP7 call. In the UK, however, building communities around products and processes related to e-Research is not yet seen as a priority. A recent e-Research Community Engagement report suggests that UK funding bodies should also include future community engagement calls similar to the EU and US programs (recommendation 15).
Lessons from open source development
Building a 'human infrastructure', as these reports suggest, is not an easy job. In fact, it can be more challenging than building and making available the current array of technical infrastructure tools, as human nature is more complex than technology and affected by a host of factors rarely taken into account by software engineers. However, projects need not start from scratch in this process, as a fair amount of experience in building communities around technical systems already exists in the area of open source software development. Some of this experience applies to web-based collaboration in general; some is particularly relevant to building and maintaining software. The next two sections highlight how lessons learned from open development, particularly in the areas of community building and sustainability, could help to increase the uptake of e-Infrastructure by UK researchers.
In open source, encouraging users and developers to engage with the project is crucial for building a thriving community. In fact, building a sustainable product largely depends on forging an environment in which users and developers share a culture of mutual support and a sense of pursuing a common goal. The effects of seeding similar ideas in research infrastructure communities could be substantial. For instance, they could help to resolve problems resulting from a lack of understanding between research stakeholders. They could also help service providers to identify the right level of guidance needed by researchers to engage with e-Infrastructure, without putting them off by excessive hand-holding. [link to relevant para in Community doc]
To create such a welcoming and 'want to come back' environment, it is essential to encourage contribution from outside the project, removing any barriers that could prevent users' access to the community. Successful open source projects make it clear that all are welcome to make their contribution in their own way. Those who do not have technical skills are also important, as they may become future users. They can also support the project by doing other tasks, from simply asking questions whose answers may be useful to newcomers, to testing and giving feedback on new releases, writing documentation and promoting the product to their friends. In the research infrastructure context, the benefit of non-technical involvement may be disseminating success stories about the use of e-Infrastructure, or helping service providers to refine their products, documentation and training in response to user feedback. [link to relevant para in Community doc]
Another lesson that e-Research can learn from open source communities is the importance of adapting tools to fit community needs. Do the provided infrastructure tools help users to ask their research questions and carry out research in an efficient and consistent manner? Are these tools easy to adapt to the specific needs of the researchers? For instance, the open development practice of using version-control systems to synchronise systems accessed by distributed users could be useful to infrastructure service providers who manage multiple versions of Grid nodes, which also need to be kept in sync [link to relevant para in Community doc]. Another useful open development practice is the use of formats and standards that allow external developers to easily engage with the software. It is also essential that the researchers themselves are able to easily build on e-Infrastructure tools, and that the output formats of these tools are diverse enough to allow them to choose one that best suits their needs [link to relevant para in Community doc].
A more detailed account of open source community lessons that may benefit research infrastructure communities is available in a related document.
A key aspect of building open source communities is encouraging collaboration from the earliest days of developing the software. In line with the famous open source dictum 'release early and often', developers are supposed to allow free access to the code from day one, despite its tentative nature, and to encourage all to contribute. This attitude is very important, as it attracts key early feedback and helps build confidence in the project. In the e-Infrastructure context, fostering collaboration from an early stage could help replace the culture of competing for funding, prompted by the Research Assessment Exercise, with a culture of jointly writing funding proposals. Promoting such an environment might encourage researchers to ask themselves what they can and cannot collaborate upon. For instance, if they are not always able to share data, they could at least consider sharing resources, such as hardware or computing power [link to relevant para in Sustainability doc].
Another feature that contributes to the success of an open source project is planning for sustainability from the beginning. Once the early versions of the project are available and potential users start showing an interest, it is critical that new members able to help with software or administrative tasks are brought on board. Although it may seem unlikely that people will just start contributing online without having met any of the team members, strong evidence from open source development shows that transparent, friendly, well-explained and well-managed projects end up attracting both users and developers. Drafting a sustainability plan and making it known to everybody also improves the chances of attracting external contributions. A plan that shows that the project team has taken into account various options and has identified a set of potential revenue streams will inspire confidence in the project's likelihood of success, and thus attract new members.
For e-Infrastructure service providers, this is a useful point to consider. A well-thought-out sustainability plan that considers various paths for further development - including the possibility that central funding will dry up - will make researchers more likely to invest time and effort in these services [link to relevant para in Sustainability doc]. By not fully depending on continuation funding, one sends a strong signal to all research infrastructure stakeholders that even in the event of a financial downturn, the service will continue to exist and people's contributions will remain safe.
Avoiding reliance on centrally provided developer support is also key to sustainability. Short-term developer support runs the risk of obscuring the need to build an environment attractive to new contributors. When developer support is withdrawn, the project does not have a mechanism in place for encouraging and absorbing external developer contributions. So, useful as it may be in the short term to help researchers produce project outputs, this support model provides little benefit in the long term, as it fails to create teams that are capable of producing self-sustainable products [link to relevant para in Community doc].
A more detailed account of open source sustainability lessons that are likely to benefit e-Infrastructure communities is available in a related document.
Embedding the existing technical research infrastructure in a 'human infrastructure' could help to remove some of the main barriers that prevent researchers from adopting these tools and services. A number of open development community and sustainability lessons could be applied in this area, including creating a culture of mutual support, encouraging internal and external contributions, adapting technology to community needs, building a sustainability plan, and not relying entirely on continuation funding or central development support.
Lee C, Dourish P, Mark G (2006) The human infrastructure of cyberinfrastructure. Proceedings of the ACM Conference on Computer Supported Cooperative Work. Banff, Alberta, Canada, November. pp. 483-492.