Community lessons for research infrastructure
Despite the availability of an impressive range of online systems and resources for researchers, a recent JISC-funded Community Engagement Report has identified a number of barriers to their adoption. These include competition for research funding, an inability to share resources, and a lack of extended software development support. Let's take a closer look at some of these barriers, and explore how experience from open development could be used to alleviate or remove them.
In the UK, e-Infrastructure consists of a combination of systems, tools and services designed to foster new ways of conducting research and improve cross-subject collaboration. Although an extensive range of such services and facilities has been developed, researchers in the UK remain reluctant to use them extensively. The main barriers to the use of these resources appear to be social and organisational rather than technological. We suggest that learning some lessons from open source development practice could improve the current situation.
We approach this topic in three related documents: a general discussion and two more detailed insights, which include quotes from recent interviews with UK researchers and service providers. In this document we highlight a number of community lessons that could help to remove barriers to engagement with UK e-Infrastructure systems and services.
Create a sense of belonging and a culture of mutual support
In open source, encouraging users and developers to engage with the project is crucial for building a thriving community. Building a sustainable product largely depends on forging an environment in which users and developers share a culture of mutual support and a sense of pursuing a common goal. The effects of seeding similar ideas in research communities could be substantial.
A number of researchers and service providers report difficulties in creating an effective channel of communication between research infrastructure stakeholders. On the one hand, software developers and service providers find it difficult to grasp academic research questions or publication requirements. On the other hand, researchers may not fully appreciate the technical challenges that implementing applications presents, or what distributed systems are and how they can support research activities. In the absence of mutual understanding, it is unlikely that these parties will know what to expect from each other.
As one IT service provider put it: 'If you buy a car, you know, most people know what you get and what you should look for. But if you come to us, "What can we do for you" is always vague, because we can do quite a lot, we’ve got a lot of expertise, but you know, if it was written down in a brochure it would be kind of a long-winded one, and often they don’t understand all the jargon that we have and all that sort of thing.'
One way of addressing this issue is by encouraging cross-specialist dialogue: 'There is a need for more people to sit down with scientists and work with them on their specific applications [...] people that understand both, people that can understand the applications and also understand how to grid-enable them.'
Removing the barriers associated with the use of technical or scientific jargon could help to build a sense of sharing a common overarching goal. Respondents emphasised the need to encourage the use of jargon-free language that is understood equally by researchers, software developers, service providers and other potential users. 'One of the things that’s sort of continually frustrating in the field is the assumed terminology, if you know what I mean, there’s a lot of terminology that’s come over from computing science, which was never designed for the rest of us who actually do the science [...]'
Another aspect of the current miscommunication among stakeholders concerns the generational gap between users who are differently socialised in terms of technology. Technical possibilities evolve faster than institutions are able to adapt. This means that the rate of technological change experienced by researchers in these institutions affects them differently, according to their technological capabilities. Consequently, university IT managers often find it difficult to establish a common, level ground within their institutions:
'What gets in the way of our strategy, there’s also the cultural thing that the committees of the university tend to be run by the more senior academics who all date back to the days before computers took over the world, and so talking to their research students and their junior post-docs is practically speaking a different language compared with speaking to the professors.'
Providing a sensible level of guidance on using research infrastructure services can help to foster a sense of belonging to a welcoming community. In addition, as one high-powered computing service provider said, tact and patience are needed in order not to drive away early adopters:
'We know that some people use [Condor] badly, but when we find out we talk to them and say, look, you need to understand a little bit more how this works [...] if you restructure your work like this you’ll get a much bigger benefit [...] It’s a tension between if you wanted to guarantee that your users used it perfectly you’d have to hand-hold them all the time, and they would find that unbearably restrictive and wouldn’t use it, so that would be stupid.'
Mediate and broker
Sometimes a more proactive step is needed to build a community. Being able to mediate between different community members is important in this respect. One respondent described a situation where a research team may need special access to the facilities provided by the National Grid Service (NGS). In order to gain access, they would need to be recommended by fellow researchers already known to the service:
'You know, it can be hard to persuade the NGS to give you a 700 gigabyte [...] to actually get the data there. So I think you have to know the right people and sort of ask them very nicely to sort of, you know, send an email as someone who knows about these things to the NGS people, and say yes actually this is [...] a serious request, and they do need this amount of data, and actually it's really only [...] temporarily for them so they can actually get the data on there.'
These intermediaries have an important role in creating a sense of being part of a mutually supportive community. As the report authors suggest, their role in this case will involve managing the researchers' expectations (by reminding them that the NGS does not provide long-term storage for data). At the same time, they would need to advise the service provider that the requested resource is part of a genuine research requirement.
Encourage contribution, remove user barriers
In order to create a welcoming and 'want to come back' environment it is essential to encourage contribution from both inside and outside the project and remove any barriers that could prevent users' involvement in the community. Successful open source projects make it clear that all are welcome to contribute in their own way, from writing code and documentation to testing and giving feedback on new releases. A similar welcoming attitude to external contribution is likely to benefit e-Infrastructure communities too.
The report mentions the importance of highlighting situations where the use of e-Infrastructure has significantly improved the research process. These success stories may inspire or motivate other researchers and need to be widely disseminated:
'It's been interesting [...] when people start going out and about and just saying what it is possible to do, just how much excitement you can generate, so I think having stuff where you can show that people have done really new science using those tools and using jealousy [...] it seems to be working quite well in terms of getting engagement, and so the fact that, you know, we're saying that other communities just like things, like the systems biology communities are beginning to be very keen to play and join in.'
Appropriate incentives can play an important role in maximising the effect of disseminating these success stories. Some of these incentives, the report suggests, need to take the Research Assessment Exercise (RAE) into account, as this is an important motivational factor for researchers and departments. This suggestion endorses an earlier recommendation by the Century of Information Strategy report:
'The funding bodies should make the award of grants and the evaluation of their outputs include a significant element of contribution to education in the domain of the research.'
Another important lesson from open source development is that contribution from external users and developers should be encouraged and rewarded. Most open source project users are passive in terms of their interactions with the community. However, when they take on more active roles, for example by reporting bugs, helping other users or writing documentation, sustainable and well-run open source communities have in place mechanisms for encouraging and rewarding their efforts. In many instances, they are offered additional access to, or control over, the project. Clear documentation explaining how one can move from passive user to contributor, then to senior contributor with commit rights, and eventually to decision-making board member, will be in place. This means that everyone knows what to do if they want to increase their role and responsibility in the project.
In e-Infrastructure environments, the first step from user to contributor often involves providing feedback on the functionality of the product being developed. Users may, for example, report a bug or notify a service that something is not working as it should. The key thing here is to make sure that, when users decide to take the trouble to provide such feedback, the path is as straightforward as possible. Acknowledging contributions immediately and addressing issues in a timely manner are equally important for the future engagement of these users.
'In the case of individual surveys, there is sometimes the case that I have downloaded a file and then found out perhaps an error within the data file, or lack of priority on a particular variable, or a variable missing within the data file that’s referred to in the documentation, and occasionally I have reported that sort of example to the help desk, and they’ve usually been, well in fact pretty much every time, they’ve been able to get back and come up with the solution quite rapidly, so I find them a very helpful service in that respect.'
Create easy ways in
Another barrier to user engagement is missing or poorly structured documentation. One researcher mentioned his repeated attempts to learn how to use a research infrastructure tool: 'Essentially it was through searching for handouts and searching for presentations that were scattered about over various websites.'
Other respondents also mentioned the need for training opportunities that would provide them with the necessary skills for using the service: 'I can see that there are things there which we probably could be able to use in the future, but first we’d have to work out how, if you know what I mean? There are projects, for example like OGSA-DAI, and OGSA-DAI has some features which I can see they would be useful if we had skills, or if we had the time to actually be able to get far enough into the technology to be able to actually utilize it properly.'
Some e-Infrastructure providers are aware of these issues and try to address them, but they are concerned about the potential costs involved. One of them suggested that providing excellent documentation could be a way of avoiding the cost of deploying and updating training solutions:
'If you want a wider benefit then you would have to make it easier for them to use, not just keep on training them, because you have to not just consider the cost of training, but also the cost of them not being engaged for a long time with their own stuff because they’re being trained to use this, and of course e-infrastructure will change again and then they have to be re-trained. So what you want to do is make it so easy for them to use that you don’t need training.'
Adapt tools to community needs
Another lesson that e-Infrastructure stakeholders can learn from open source communities is the importance of designing or requesting systems and services that fit their community needs. These should help users ask their research questions and carry out research in an efficient and consistent manner. At the same time, they should be easy to adapt to the specific needs of the various research teams.
One of the main aspects revealed by the report is that more attention has been paid to developing e-Infrastructure technology than to understanding the social aspects associated with its use. The implication is that the entire process of building these systems and services may need to be reconsidered. Instead of building tools thought to be useful to the research communities, and struggling afterwards to make researchers use them, a more sensible approach would be to involve researchers in the very process of designing and building them. This would ensure that what is being built fully suits their needs.
One problem with the current approach flagged by several arts and humanities researchers concerned the e-Science paradigm itself, which may be inappropriate for supporting the types of questions usually asked in their disciplines: 'Being able to fit an e-Science paradigm into Arts and Humanities is the problem rather than whether we can use the technology.' In other words, e-Infrastructure tools may not be discipline-neutral, and their use in some research projects may modify the established research practices and negatively influence the research results. Consequently, their adoption by the respective researchers may be compromised.
'There are fundamental barriers to using these technologies for the Arts and Humanities, but that’s to do with the discipline [...] rather than [with] not understanding the technology, not being able to get the technology. It’s a matter of what can we do with these technologies which are useful, which are methodological questions, and also [...] about what is Arts and Humanities, what kind of research are we doing?'
People over technology
Understanding the aims of the community and choosing technology accordingly is crucial for the success of open source projects. In some open source environments, this issue goes beyond the management of project tools and affects the very processes of contributing software code. 'Look after the community and the code will look after itself' is how members of The Apache Software Foundation memorably put it. Having appropriate processes in place for building the community creates a sustainable way of developing the software itself.
In many cases, adopting technologies that fit the needs of the community involves using project tools, such as version control systems, bug trackers, mailing lists, and so on. Research communities are likely to benefit from following open development practices like these. Indeed, most issues flagged by the report respondents concerned practical aspects of engaging with technology. For instance, one of them mentioned difficulties in engaging with different configurations of the NGS nodes, which resulted in the need to compile code repeatedly:
'So the problems we faced from the user’s point of view, I guess the biggest problem, is that the machines in terms of libraries and compiler versions aren’t kept in sync. So it’s not – MPI libraries and so on – it’s not always possible to compile on one machine, especially MPI executable, and then take it to another of the NGS machines and run it, and this is a significant problem because it means that [...] where you use multiple machines you have to log into each one in turn and compile your code, which takes time, and the more grid resources you have, the more time that takes.'
A proper mechanism for keeping different software versions in sync is crucial. In this respect, the open development experience of distributed teams working on successive versions of software code can be useful. Similar solutions could be devised for streamlining the process of synchronising the NGS nodes, for example. The key is the ability to coordinate the people and processes that affect the performance of these nodes, rather than necessarily re-designing the technology itself.
Another issue mentioned in relation to the NGS concerns the degree to which research infrastructure software allows users to develop applications or extensions on top of existing ones.
'[...] what I would like to do is build applications that use the NGS underneath, and the NGS has the wrong kind of interface at the moment, so it’s difficult for me to build those applications [...] it’d be useful if the interface was a service, at the moment it’s a log-on, I have to log on to the system and type my own commands.'
A number of open development community-related lessons could help to remove barriers to user engagement with e-Infrastructure tools and services. Researchers are more likely to use these facilities if they feel that they are part of a culture of mutual support and collaboration, where contribution is encouraged and rewarded from an early stage. They are also more likely to engage with technological solutions that are adapted to fit the real needs of the community.
Related information from OSS Watch
- [researchinfrastructure.xml Open Source and Research Infrastructure]
- [odm.xml Avoiding abandon-ware: getting to grips with the open development method]
- [sustainableopensource.xml Sustainable open source]
- [howtobuildcommunity.xml How to build an open source community]