Brede Tools and Federating Online Neuroinformatics Databases

As open science neuroinformatics databases the Brede Database and Brede Wiki seek to make distribution and federation of their content as easy and transparent as possible. The databases rely on simple formats and allow other online tools to reuse their content. This paper describes the possible interconnections on different levels between the Brede tools and other databases.


Introduction
The concept of open science entails free access to data and methods. In neuroinformatics, open science would allow federation of databases, so that researchers can make queries across data sets and ontologies.
Open science represents the first step, enabling data sharing. As the next step we would like to query across the different databases, and in a further step we would like to use the data across multiple databases in statistical computations so that a meta-analytic consensus can emerge. For neuroinformatics these last two steps are hindered by the plethora of different data formats, brain atlases and terminologies, and by the division of data between several databases, which makes even the discovery of relevant resources difficult. We would like the databases to expose their data in both human-readable and machine-readable formats. With a machine-readable format neuroinformaticians can work with data en masse and merge the data across databases. However, even with open science data in a machine-readable format one still has to match and link heterogeneous data in different formats. This is where integrative neuroinformatics tools come into play. For information retrieval these tools should have an understanding of concepts rather than just keywords (Gupta et al, 2008).
Several tools have been described for integrating or federating neuroscience databases (Gupta et al, 2008; Ashish et al, 2010; Cheung et al, 2009). One of the major database federation efforts is the Neuroscience Information Framework (NIF), which uses the Neuroscience Information Framework standardized (NIF-STD) ontology (Bug et al, 2008). With this ontology NIF performs term expansion from a user query. The expanded query is translated to queries for the different source databases, and through a data mediator the queries are sent to these external databases and the results aggregated. The system relies on tools, e.g., for registering the schema of the external database and for full text search. The complete system may search web pages, databases, Extensible Markup Language (XML) and other documents.
One way of federating databases is through the so-called Semantic Web. First prominently described by Berners-Lee et al (2001), its community has now established a number of technologies around the concept, e.g., triple stores, Resource Description Framework (RDF), Notation3 (N3) and the Web Ontology Language (OWL). Building and using actual Semantic Web databases has gained momentum, especially with the so-called Linked Data approach, and the Linking Open Data cloud (http://lod-cloud.net) organizes many open data sets. IBM Watson DeepQA has recently shown the powerful applicability of the Semantic Web as a part of a system for general question answering in the televised Jeopardy game, where the system operated on "structured and semistructured knowledge available from for example the Semantic Web" (IBM, 2012). Among more than 100 techniques, DeepQA relied on the Semantic Web resources DBpedia and the Yago ontology (Ferrucci et al, 2010). Semantic Web technologies have not yet taken a strong foothold in neuroinformatics, but some projects have used them, e.g., the Cognitive Paradigm Ontology (CogPO), based on the BrainMap taxonomy, expresses its components in the Semantic Web OWL format (Turner and Laird, 2011).
One of the software packages that allows users to collaboratively construct Semantic Web resources online is the Semantic MediaWiki extension, which extends MediaWiki-based wikis so that wiki links can be typed and semantic information queried (Krötzsch et al, 2006). Further extensions on top of Semantic MediaWiki allow for, e.g., form-based input and import and export of data in comma-separated values. Semantic MediaWiki has been embraced in neuroinformatics with NeuroLex and ConnectomeWiki (Gerhard et al, 2010).
The Semantic Web is no silver bullet for neuroinformatics. Data may not align well with the framework provided by the Semantic Web. For instance, central to neuroimaging are the volume files with associated image processing and statistical analysis. Such files can be linked and described by the Semantic Web, but this does not make specialized data analysis methods or image-based queries available. In some cases neuroscience data are conveniently described in tabular form, and indeed many current neuroinformatics databases use relational databases where mature database engines provide complex query facilities. Sufficiently complex queries required for scientific data may simply not be available in the Semantic Web query language SPARQL (Gray et al, 2009). Furthermore, for neuroimaging results we found that Semantic MediaWiki does not provide sufficient functionality for implementing the statistical analysis needed for meta-analysis (Nielsen et al, 2012).
The following sections first introduce the Brede Database and Wiki and then describe how the neuroinformatics databases link up with other databases.

Brede tools
The Brede Database (http://neuro.compute.dtu.dk/services/brededatabase/) (Nielsen, 2003) contains data from 186 published neuroimaging articles that include stereotaxic coordinates. In a fashion inspired by the BrainMap database, the Brede Database structures data from each article into one or multiple experiments, each of which may contain one or more stereotaxic coordinates. There are 586 experiments, and the Brede Database is supported by simple ontologies for topics, brain regions, journals and persons. The purpose of the Brede Database is to provide an open science data source for visualization and meta-analysis in neuroimaging. The Brede Wiki (http://neuro.compute.dtu.dk/wiki/) (Nielsen, 2009a) has a broader scope than the Brede Database, recording not only studies with stereotaxic coordinates, but also brain morphometry and personality genetics studies as well as studies outside neuroscience. It contains descriptions of 869 journal papers and 174 conference papers, 129 pages with stereotaxic coordinates, 479 pages each describing a brain region and 599 pages describing a 'topic'. In total the wiki presently has 3,291 content pages (this number can be compared with the 7,555 content pages in NeuroLex). Apart from providing open science data, the purpose of the Brede Wiki is also to provide a more direct way to contribute data and ontology information, to be able to handle many different forms of data (not just stereotaxic coordinates) and to provide means for freeform textual annotation of scientific publications. The Brede Wiki for Personality Genetics (Nielsen, 2010) has data from 87 published personality genetics studies with information about personality scores, genotype and subject group. Its purpose is to perform a mass meta-analysis of personality genetics and to provide a testing ground for an open science collaborative online "real-time" mass meta-analysis with visualization. Kötter (2001) laid out the main points to consider in the construction of neuroscience databases. The following paragraphs will use these main points to describe the Brede Database and Brede Wiki.
Data acquisition: Results from published neuroimaging experiments are manually entered in the Brede Database via a graphical user interface implemented in Matlab and available via the Brede Toolbox. Other users can download the toolbox and enter data for inclusion in the Brede Database. In the Brede Wiki, registered and "anonymous" wiki users can enter data and text directly in the standard wiki interface. To aid data entry, web services can extract bibliographic information from PubMed and extract part of the brain coordinate information from scientific Portable Document Format (PDF) documents by converting the PDFs to text and matching brain coordinates with regular expressions. This method is related to the extraction method used in connection with the NeuroSynth database (Yarkoni et al, 2011). A template in the Brede Wiki creates links to these two web services based on the PubMed identifier and a link to an accessible PDF. To handle the slight differences in stereotaxic coordinate spaces, the Brede Toolbox transforms brain coordinates to the Talairach atlas during data entry, storing both the originally reported and the transformed coordinates in the Brede Database. For the Brede Wiki the originally reported coordinates are entered along with a field indicating the stereotaxic space. The Brede Wiki for personality genetics provides online form-based input for data entry and may export its data in the MediaWiki template format for inclusion in the standard Brede Wiki.
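The regular-expression step can be illustrated with a minimal sketch. The pattern below is an assumption made for illustration, not the expression used by the Brede web service: it looks for three signed numbers separated by commas or whitespace and keeps triples within a plausible millimetre range. The names NUMBER, TRIPLE and extract_coordinates are hypothetical.

```python
import re

# Assumed pattern: three signed numbers separated by commas and/or whitespace,
# e.g. "-42, 18, 6" or "34 -60 52.5". Real coordinate tables vary far more.
NUMBER = r"[-+\u2212]?\d{1,3}(?:\.\d+)?"
TRIPLE = re.compile(
    rf"(?<![\d.])({NUMBER})[\s,;]+({NUMBER})[\s,;]+({NUMBER})(?![\d.])"
)


def extract_coordinates(text, limit=120.0):
    """Return (x, y, z) tuples found in text that lie within +/- limit mm."""
    coordinates = []
    for match in TRIPLE.finditer(text):
        # PDF-to-text conversion often yields the Unicode minus sign (U+2212).
        x, y, z = (float(g.replace("\u2212", "-")) for g in match.groups())
        if all(abs(value) <= limit for value in (x, y, z)):
            coordinates.append((x, y, z))
    return coordinates


if __name__ == "__main__":
    sample = (
        "Left inferior frontal gyrus -42, 18, 6 4.51\n"
        "Right precuneus 34 \u221260 52 3.90"
    )
    print(extract_coordinates(sample))
```

A pattern this simple cannot tell a coordinate from three adjacent statistics in a table row, which is one reason such extraction recovers only part of the coordinate information and still needs manual checking.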
Data quality control: Brain coordinate information is double checked during data entry. For the Brede Database an algorithm may detect outliers by statistical modeling of the 3-dimensional distribution of brain coordinates conditioned on the neuroanatomical label. One can sometimes trace the detected outliers to entry errors either in the database or in the original paper (Nielsen and Hansen, 2002). From time to time vandals corrupt the open Brede Wiki. However, the standard vandal fighting mechanisms in the MediaWiki software and aggressive blocking keep this problem to a minimum. So far the vandals have usually written on new pages and have not corrupted genuine content.
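As a rough illustration of label-conditioned outlier detection, the sketch below swaps in a deliberately simple model: an axis-aligned Gaussian per anatomical label, fitted leave-one-out. This is an assumption made for brevity and not the model of Nielsen and Hansen (2002). The threshold of 16.27 is approximately the 99.9% point of a chi-square distribution with 3 degrees of freedom, and min_sd is an assumed floor on the standard deviation.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(records, threshold=16.27, min_sd=1.0):
    """records: list of (label, (x, y, z)). Return indices of suspicious records."""
    by_label = defaultdict(list)
    for index, (label, _) in enumerate(records):
        by_label[label].append(index)

    flagged = []
    for indices in by_label.values():
        if len(indices) < 4:  # too few coordinates to model this label
            continue
        for i in indices:
            # Leave-one-out: fit on the other coordinates sharing the label,
            # so a gross error does not inflate its own variance estimate.
            others = [records[j][1] for j in indices if j != i]
            score = 0.0
            for axis in range(3):
                values = [coordinate[axis] for coordinate in others]
                sd = max(stdev(values), min_sd)
                score += ((records[i][1][axis] - mean(values)) / sd) ** 2
            if score > threshold:
                flagged.append(i)
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up coordinates; the last one has a flipped sign on x.
    amygdala = [(-22, -4, -18), (-24, -6, -20), (-20, -2, -16),
                (-23, -5, -17), (-21, -3, -19), (-22, -4, -18), (22, -4, -18)]
    print(flag_outliers([("left amygdala", c) for c in amygdala]))
```

A sign flip on the x coordinate, as in the made-up example, is the kind of entry error such a check is meant to surface for manual inspection.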

Data representation: The Brede Database stores content in a so-called poor man's XML format, a hierarchical XML structure that contains no attributes and no empty tags. The Brede XML files function as the "back-end" of the Brede Database. We convert parts of the central XML file (wobibs.xml) to an SQL representation for faster coordinate-based search. In that application a single table represents one stereotaxic coordinate in each row. The MediaWiki software features so-called templates as a standard part. The Brede Wiki stores structured data, such as bibliographic information, ontology information and brain coordinate information, in simply formatted MediaWiki templates. This allows relatively easy extraction of the data (Nielsen, 2009a). The Brede Wiki can store comma-separated values content in the standard wiki pages. A major inclusion of this kind of content comes from a collaboration with Matthew Kempton after his and his group's large meta-analyses of structural neuroimaging in mental disorders (Kempton et al, 2008, 2010, 2011). The Brede Wiki can also store "multimedia files" such as summary neuroimages, e.g., contrast and residual neuroimages as Neuroimaging Informatics Technology Initiative (NIfTI) files.
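The conversion from attribute-free XML to a one-row-per-coordinate SQL table can be sketched as follows. The element names (Bibs, Bib, Exp, Loc, pmid, x, y, z), the table layout and the sample values are hypothetical and do not reproduce the actual wobibs.xml schema; only the general idea of flattening the hierarchy into rows is taken from the text.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical "poor man's XML": nested elements only, no attributes,
# no empty tags. Element names and values are made up for illustration.
SAMPLE = """
<Bibs>
  <Bib>
    <title>Example study</title>
    <pmid>00000000</pmid>
    <Exp>
      <description>Task versus rest</description>
      <Loc><x>-42</x><y>18</y><z>6</z></Loc>
      <Loc><x>34</x><y>-60</y><z>52</z></Loc>
    </Exp>
  </Bib>
</Bibs>
"""


def load_coordinates(xml_text):
    """Flatten the XML hierarchy into one SQL row per stereotaxic coordinate."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE coordinate "
        "(pmid TEXT, experiment INTEGER, x REAL, y REAL, z REAL)"
    )
    root = ET.fromstring(xml_text.strip())
    for bib in root.iter("Bib"):
        pmid = bib.findtext("pmid")
        for number, exp in enumerate(bib.iter("Exp"), start=1):
            for loc in exp.iter("Loc"):
                xyz = tuple(float(loc.findtext(axis)) for axis in "xyz")
                connection.execute(
                    "INSERT INTO coordinate VALUES (?, ?, ?, ?, ?)",
                    (pmid, number) + xyz,
                )
    return connection


if __name__ == "__main__":
    db = load_coordinates(SAMPLE)
    print(db.execute("SELECT * FROM coordinate WHERE x < 0").fetchall())
```

Because the format has no attributes, every value is reachable with plain element lookups, which is part of what makes it easy for other tools to reuse.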

System implementation: The philosophy behind the Brede Database and Wiki of using simple and standardized components makes them easy to migrate and distribute. We use standard free software for the online components of the Brede tools: Apache, MediaWiki, Perl and Python running on a Debian/Ubuntu computer. The Brede Wiki runs with a standard MediaWiki implementation (http://www.mediawiki.org/) supported by a few extensions available online, e.g., one to render geographical maps via OpenStreetMap and another one to format comma-separated values data in a sortable table.
User interface and documentation: The Brede Toolbox provides command-line (CLI) and graphical (GUI) user interfaces to the Brede Database. The Brede Toolbox has standard Matlab headers with documentation. The wiki maintains help pages.
Tools for data analysis: Matlab functions from the Brede Toolbox can extract coordinates and other content from the Brede Database and Brede Wiki and can perform coordinate-based meta-analysis and text mining. The Brede Wiki for personality genetics has hard-coded standard meta-analysis methods for online mass meta-analysis. The Brede Wiki enables online meta-analysis of content in comma-separated values format (Nielsen et al, 2012), e.g., enabling an online version of the mass meta-analysis by Kempton et al (2011).
The meta-analysis associated with the Brede Wiki operates with standard (single parameter) meta-analysis methods with effect sizes based on standardized mean differences, logarithmic odds ratios and logarithmic variance ratios (Hartung et al, 2008; Shaffer, 1992). The meta-analysis web service reports results in tables, forest plots and funnel plots. Furthermore, the web service can export both data and results in machine-readable format, enabling users to query for multiple meta-analytic results and perform a mass meta-analysis in their own environment. The mass meta-analysis web service of the Brede Wiki queries the meta-analysis web service multiple times to form a L'Abbé-inspired plot of multiple meta-analytic effect sizes and their uncertainties, and also shows these effect sizes in a table sorted according to P-value. Brede Wiki pages on, e.g., obsessive-compulsive disorder and amygdala feature examples of brain volume mass meta-analysis.
The Brede Wiki meta-analysis cannot yet perform coordinate-based or image-based meta-analysis.
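To make the standardized-mean-difference case concrete, the following sketch computes Cohen's d with its usual large-sample variance and pools studies by inverse-variance weighting. The choice of a fixed-effect model and a normal-approximation P-value is an assumption made for illustration; the text does not state which pooling model the web service applies, and the function names are hypothetical.

```python
import math


def standardized_mean_difference(m1, s1, n1, m2, s2, n2):
    """Cohen's d and its approximate sampling variance for two groups."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    variance = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return d, variance


def fixed_effect(effects):
    """Inverse-variance pooling of (effect, variance) pairs.

    Returns the pooled effect, its standard error and a two-sided
    P-value based on the normal approximation.
    """
    weights = [1.0 / variance for _, variance in effects]
    pooled = sum(w * e for w, (e, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p = math.erfc(abs(pooled / se) / math.sqrt(2))
    return pooled, se, p


if __name__ == "__main__":
    # Made-up summary statistics: (mean, sd, n) for patients and controls.
    studies = [
        standardized_mean_difference(10.0, 2.0, 20, 8.0, 2.0, 20),
        standardized_mean_difference(9.5, 2.5, 30, 8.5, 2.5, 30),
    ]
    print(fixed_effect(studies))
```

Running fixed_effect once per outcome and sorting the results by the returned P-value gives the kind of table that a mass meta-analysis presents.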
Budget and maintenance: The Brede Wiki and Brede Database run from a single computer at the department level, administered and maintained by the author. The Brede system presently has no budget associated with it per se. By using Open Source software and commodity hardware the cost can be kept to a minimum.
Legal and ethical issues: We release the databases under copylefted share-alike licenses; see also Miller et al (2008) for a discussion of a license for databases.
Licensing for open science data has generated discussion concerning attribution and share-alike aspects. Some open science proponents argue that open data should be public domain or (for jurisdictions where databases fall under sui generis rights) that the data should be licensed under the Creative Commons Zero (CC0) license (Science Commons, 2007). Other open data proponents advocate that open data should be licensed under conditions of attribution and share-alike (e.g., CC BY-SA and the Open Data Commons Open Database License, ODbL).
The CC0 side argues that attribution-requiring licenses (e.g., CC BY) result in the problem of "attribution stacking", placing a burden on the large-scale database integrator when individual attribution of possibly many thousands of data providers is needed (Science Commons, 2007). However, pragmatic solutions exist, such as attributing contributors as a group or linking to a page with the contributors listed (Fitzgerald, 2009). Potential incompatibility problems between two share-alike licenses have in practice not been a problem: Both Wikipedia and OpenStreetMap have changed share-alike licenses. Share-alike ensures subsequent "sharing back" and mutual sharing (Open Knowledge Foundation, 2009; Nielsen, 2009b), and this is why the Brede systems are under share-alike licenses. In Linked Data a commercial entity can link to Brede data. An entity incorporating an ODbL database in a collective database does not need to put the whole collective database under ODbL, only the ODbL part.
The databases do not contain raw imaging data or large genome-wide scans, and thus have no privacy issues in that respect. The Brede Wiki describes researchers using only public professional CV information.
Comparison and federation: Federation is described in depth in the following sections.
Impact and significance: Antonia Hamilton has included data from the Brede Database in her AMAT database (Hamilton, 2009), and SumsDB (Van Essen, 2009) includes these data. NIF also includes data from the Brede Database. The web server and wiki maintain simple web access statistics, but bots severely affect the results, making it difficult to say how widely the Brede Wiki and Database are used.

Linking bibliographic information
Most neuroscience data are associated with a specific publication, so it is natural to navigate based on the bibliographic information when a neuroscientist seeks data. The central part of the Brede Database has the publication at the top level, and in the Brede Wiki some of the data are stored on the wiki page for the publication rather than on a separate page.
The scientific article provides perhaps the most straightforward means for federation. PubMed indexes most neuroscience articles, and other databases can use its integer PubMed identifier (PMID) to uniquely identify articles.
If neuroinformatics databases use the PMID as a key for papers, other web services can easily predict the identifier for data items. However, not all papers have PMIDs. Of the 87 papers in the Brede Wiki for personality genetics, 4 papers did not have a PMID at the time of entry. The Brede Wiki, which has a wider scope than biomedical journals (incorporating, e.g., psychological and web science papers), presently records 869 journal papers, of which 200 have no PMID. Most of these 200 papers come from journals not indexed by PubMed. The Brede Database uses its own unique identifier for each scientific paper while also recording the PMID. Like AcaWiki (http://acawiki.org) and WikiPapers (http://wikipapers.referata.com), the Brede Wiki uses the title of the scientific paper as the title of the wiki page.
The web service Citedin (http://www.citedin.org) from the Department of Bioinformatics of Maastricht University searches several online databases based on PMID and aggregates the search results on a single page. The service searches, e.g., AlzForum and Mendeley. It also searches the Brede Wiki by querying the MediaWiki API with the PMID integer, letting the MediaWiki API return the result set in JavaScript Object Notation (JSON). A Citedin search with, e.g., the PubMed ID 20537866 (Kempton et al, 2010) will list that the Brede Wiki and Mendeley (https://www.mendeley.com) each have an entry for the paper and that none of the many other bioinformatics resources searched has an entry for it.
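A PMID lookup of this kind can be sketched with the standard MediaWiki search API. The parameters action=query, list=search, srsearch and format=json are part of the MediaWiki API, but the endpoint URL below is a placeholder, the sample response is made up, and the exact request that Citedin issues is not documented here, so the sketch should be read as one plausible form of the query.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint; the location of api.php on a given wiki is an assumption.
API_URL = "https://wiki.example.org/api.php"


def build_search_url(api_url, pmid):
    """URL for a MediaWiki full-text search on a PMID, answered as JSON."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": str(pmid),
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def titles_from_response(json_text):
    """Extract page titles from a MediaWiki search response."""
    result = json.loads(json_text)
    return [hit["title"] for hit in result.get("query", {}).get("search", [])]


if __name__ == "__main__":
    print(build_search_url(API_URL, 20537866))
    # Made-up response with the structure returned by list=search.
    response = '{"query": {"search": [{"title": "Title of the matching paper"}]}}'
    print(titles_from_response(response))
```

Fetching the URL (for instance with urllib.request.urlopen) and passing the body to titles_from_response completes the round trip; the network call is left out so that the sketch runs offline.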
The LinkOut facility in Entrez also uses PMID to link databases together. The NIF Extensible Web resource DISCOvery, registration and interoperation framework (DISCO) can set up the LinkOut via XML files (Marenco et al, 2010). Via the DISCO setup, PubMed links to experiments in the Brede Database on PubMed abstract pages under the item "LinkOut", e.g., the PubMed entry for the article "Command-related distribution of regional cerebral blood flow during attempted handgrip" (PMID: 10066691) links to the web pages for the 3 experiments in the paper, the first one located at http://neuro.compute.dtu.dk/services/brededatabase/WOEXP47.html.
Where possible, the pages for papers in the Brede Wiki may link to the more general bibliographic databases PubMed, PubMed Central, Microsoft Academic Search, CiteULike, arXiv and SSRN, and to the publisher's website through the Digital Object Identifier (DOI), as well as to the neuroinformatics databases Brain Operation Database (BODB), fMRI Data Center, Internet Brain Volume Database (IBVD, http://www.cma.mgh.harvard.edu/ibvd/), OpenfMRI Database, SumsDB and the Brede Database. The Brede Wiki may also record equivalent identifiers for items in the BrainMap database, but the Brede Wiki cannot establish deep links to BrainMap. Except for the DOI, which PubMed makes available, the wiki user needs to enter the identifiers for deep linking manually into fields in a MediaWiki template.
On the Semantic Web the Dublin Core metadata can describe information resources, such as a scientific paper. However, Dublin Core targets general information resources and not just scientific articles, so it has properties for title, author and date of publication, but not for journal title and journal issue. Other metadata standards need to be invoked to describe such bibliographic details. This could, e.g., be the Bibliographic Ontology (http://bibliontology.com/) or the Publishing Requirements for Industry Standard Metadata (PRISM) standard (International Digital Enterprise Alliance, Inc., 2008).
Linking on the publication level provides a means to determine whether publications are available in other databases. However, it provides only a starting point for data federation, as the individual databases typically represent the data in the publications in different ways.

Linking experiments
The experiment is typically the entity that is relevant for meta-analysis. A good description of experiments in a database allows for automated "bottom-up meta-analyses", where an automated selection of studies is based on independent annotation of each study rather than on a manual search and selection of eligible studies by an expert. The annotation in the Brede Database has allowed such bottom-up mass meta-analyses (Nielsen, 2005, 2009c). Linking to experiments in other databases might also be relevant for the meta-analyst: Perhaps the data in the original database are not complete or not in an appropriate format, while the linked database presents relevant data. For instance, the Brede Database provides stereotaxic coordinate data for a few papers that are also present in fMRIDC. By linking to fMRIDC the meta-analyst is alerted to data for possible image-based meta-analysis.
The BrainMap database used its own taxonomy for describing an experiment (Fox et al, 2005). The Brede Database uses part of this taxonomy as well as an ontology for topics, the so-called "external components" (external from the neuroimage), to annotate an experiment. These topics are connected in a directed acyclic graph. The Brede Wiki also has a topic ontology, and a curator may annotate experiments in the Brede Wiki with both the Brede Wiki and the Brede Database ontologies. Both ontologies link to Medical Subject Headings (MeSH), and the Brede Wiki ontology furthermore links partly to NeuroLex, Wikipedia and the Brede Database. The Brede Wiki has a few deep links to the topic ontologies of the newer efforts in experiment annotation, CogPO and the Cognitive Atlas (Turner and Laird, 2011; Poldrack et al, 2011). Individual experiments in the Brede Database have unique identifiers, while the Brede Wiki records the experiments on the same wiki page as the one describing the scientific article. However, the Brede Wiki may still link, e.g., to corresponding data in BODB ("Summaries of Empirical Data", SEDs) and the Brede Database. Some automated mass meta-analyses have relied on individual words (tags) extracted from the articles to describe the experiments (Nielsen et al, 2005, 2006; Yarkoni et al, 2011). Individual words are not per se tied to a semantic structure, but researchers outside neuroinformatics have developed methods to yield "emergent semantics" from tagging systems by measuring the generality of tags (Benz et al, 2011).
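The role of the topic graph in a bottom-up selection can be sketched as follows: an experiment qualifies for a meta-analysis on a topic if it is annotated with that topic or with any narrower topic reachable in the directed acyclic graph. The topic names, the experiment identifiers and the data layout are all hypothetical; the sketch is not the Brede Toolbox implementation.

```python
from collections import defaultdict

# Hypothetical topic hierarchy (child -> list of parents) and annotations.
PARENTS = {
    "memory": [],
    "language": [],
    "working memory": ["memory"],
    "episodic memory": ["memory"],
    "verbal working memory": ["working memory", "language"],
}
ANNOTATIONS = {
    "exp1": ["verbal working memory"],
    "exp2": ["episodic memory"],
    "exp3": ["language"],
}


def descendants(topic, parents):
    """Return the topic together with all narrower topics in the DAG."""
    children = defaultdict(set)
    for child, its_parents in parents.items():
        for parent in its_parents:
            children[parent].add(child)
    found, stack = {topic}, [topic]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:  # a node may be reachable via several parents
                found.add(child)
                stack.append(child)
    return found


def select_experiments(topic, parents, annotations):
    """Experiments annotated with the topic or any of its descendants."""
    wanted = descendants(topic, parents)
    return sorted(
        experiment
        for experiment, topics in annotations.items()
        if wanted.intersection(topics)
    )


if __name__ == "__main__":
    print(select_experiments("memory", PARENTS, ANNOTATIONS))
```

Because a topic may have several parents, the made-up "verbal working memory" experiment is selected for both a memory and a language meta-analysis, which a strict tree could not express.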
In the Semantic Web, ontologies such as EXPO may describe scientific experiments in general, while domain-specific ontologies may be used to describe the details in their respective scientific areas (Soldatova and King, 2006).
For describing experiments the Brede ontologies do not have as deep a coverage as BrainMap, CogPO and the Cognitive Atlas. However, the level of interconnection of Brede is large compared to other databases.

Linking brain regions
Both the Brede Database and the Brede Wiki have brain region ontologies. They organize the brain regions in a directed acyclic graph with links to broader and narrower regions (parent and child brain regions) and include naming and abbreviation variations, i.e., one canonical name and possibly multiple variations and abbreviations. The Brede Database defines 763 brain regions, while the Brede Wiki presently has 478 brain regions defined. A specific purpose of the Brede brain region ontologies is to handle the brain anatomy annotation associated with stereotaxic coordinates in neuroimaging research.
The ability to recognize names of brain regions in unstructured text, such as a scientific abstract, is important for text mining and automated population of databases. The recognition methods may start with a dictionary of neuroanatomical terms and then use simple dictionary matching. More advanced methods use statistical natural language processing techniques (French et al, 2009). The brain region ontologies of the Brede Database and Brede Wiki record variations in naming and abbreviations so they can act as a basis for text entity recognition. The anatomical labeling of individual coordinates in the same brain region may differ, and the listing of naming variations helps to achieve higher recall rates, e.g., in connection with automated meta-analysis where automated algorithms extract coordinates from a database based on neuroanatomical label (Nielsen et al, 2006).
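Simple dictionary matching with naming variations can be sketched in a few lines. The entries below are made up for illustration and are not taken from the Brede ontologies; each canonical name maps to the variations under which it may appear in text.

```python
import re

# Hypothetical entries: canonical name -> naming variations and abbreviations.
REGIONS = {
    "Left inferior frontal gyrus": ["left inferior frontal gyrus", "left IFG"],
    "Amygdala": ["amygdala", "amygdalae", "amygdaloid complex"],
    "Posterior cingulate gyrus": [
        "posterior cingulate gyrus",
        "posterior cingulate cortex",
        "PCC",
    ],
}


def build_matcher(regions):
    """Compile one case-insensitive pattern covering every variation."""
    lookup = {
        variant.lower(): canonical
        for canonical, variants in regions.items()
        for variant in variants
    }
    # Longest variations first, so the most specific phrase is preferred.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, alternatives)) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup


def find_regions(text, regions=REGIONS):
    """Return (matched text, canonical name) pairs in order of appearance."""
    pattern, lookup = build_matcher(regions)
    return [
        (match.group(1), lookup[match.group(1).lower()])
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    abstract = ("Activation was found in the left IFG and the posterior "
                "cingulate cortex (PCC), but not in the amygdalae.")
    for found, canonical in find_regions(abstract):
        print(found, "->", canonical)
```

Mapping every variation to its canonical name is what raises recall; the cost is precision, since short abbreviations matched case-insensitively can collide with unrelated words.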
Both Brede brain region ontologies link to some of the digital atlases. The Brede Database ontology links to all 116 labeled brain regions from the Automated Anatomical Labeling (AAL) atlas (Tzourio-Mazoyer et al, 2002), 35 regions link to the regions from PVELAB (Svarer et al, 2005) and 43 regions link to the Hammers atlas (Hammers et al, 2002). The Brede Wiki links its brain regions to some of the brain regions in the Brede Database, IBVD, Brain Architecture Management System (BAMS) (Bota et al, 2005), NeuroLex and BrainInfo neuroinformatics databases as well as to MeSH and the digital atlases AAL, the Hammers atlas, the 69 regions from the Harvard-Oxford Atlas (http://www.fmrib.ox.ac.uk/fsl/data/atlasdescriptions.html) and the 56 regions in the LONI Probabilistic Brain Atlas (LPBA40) (Shattuck et al, 2008). The manual hard-linking to digital human brain atlases may serve as an alternative to the data-driven atlas reconciliation of Bohland et al (2009).
The Brede Database also deep links to "sites" in the CoCoMac database (Kötter, 2004). Matching brain areas between the Brede Database (which has its basis in human neuroimaging) and CoCoMac (with its macaque tracing studies) is difficult, and the validity of the links has not been closely examined.
The CoCoMac database has an elaborate system for specifying anatomical terms based on the definitions and delineations of different authors. The Brede systems do not distinguish between different definitions and delineations but instead try to reconcile them to a consensus. Parcellations may differ between brain atlases and other neuroinformatics resources, and the Brede systems do not provide a way to state these differences.
Contrary to the case with individual publications and experiments, Wikipedia has a fairly extensive collection of wiki pages for brain regions, with each region being linked to several neuroinformatics resources. As DBpedia extracts the information on these Wikipedia pages, a central resource in the Linked Open Data cloud is a Semantic Web hub for brain regions.
BrainInfo and NeuroLex are larger than the Brede brain region ontologies. As the Brede ontologies primarily target neuroimaging, Brede describes many larger brain areas, while many smaller brain areas (typically not reported in neuroimaging studies) are not described in the Brede systems.
Somewhat unusually for brain region ontologies, Brede has separate entities for left and right brain regions. This may be an advantage in human brain mapping if, say, the functions of the left inferior frontal gyrus are different from the functions of the right inferior frontal gyrus. A further element of the Brede brain region ontologies is the extensive linking to brain atlases, providing a hook to the three-dimensional brain space.
As a small test of data integration, the overlap in brain regions among the Brede Wiki, NeuroLex and BrainInfo was examined to seek out missing links or inconsistencies among these databases. From the SQL database of the Brede Wiki we identified 44 of a total of 467 brain regions where the Brede Wiki had deep links to both NeuroLex and BrainInfo. By a semantic query on NeuroLex in the categories "Regional part of nervous system", "Parcellation scheme parcel" and "Superficial feature of brain" we downloaded 1505 brain regions in a comma-separated values file. Of these 1505 brain regions, 645 have a BrainInfo identifier. A test on the 44 Brede Wiki brain regions identified 5 mismatches in the BrainInfo information between the Brede Wiki and NeuroLex and 4 brain regions with an apparently missing BrainInfo identifier in NeuroLex. The four were hippocampus, corpus callosum, dentate gyrus and lateral ventricle. For hippocampus the Brede Wiki links to the "CA fields" of BrainInfo, corpus callosum is linked to item 173 in BrainInfo, dentate gyrus to 161 and lateral ventricle to 191. The 5 mismatches were (in the Brede Wiki): Brodmann area 13, Brodmann area 18, inferior frontal sulcus, middle temporal sulcus and supracallosal gyrus. Brodmann area 13 is a clear typo in the Brede Wiki. The BrainInfo identifiers for Brodmann area 18 and middle temporal sulcus were also typos. Inferior frontal sulcus is an error in the Brede Wiki stemming from the problem of handling overlapping identifiers in BrainInfo: 64 (type h) is "lateral orbital sulcus" while 64 is "inferior frontal sulcus (human)". The same error occurs for supracallosal gyrus. These errors in the Brede Wiki are now corrected.
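The comparison itself amounts to a join on region name followed by a check of the BrainInfo identifier on each side. The sketch below assumes that both sources have already been reduced to name-to-identifier mappings and that names match exactly; the identifiers 173, 161 and 191 are those quoted in the text, while "region A", "region B" and their identifiers are placeholders.

```python
def compare_links(wiki_links, neurolex_links):
    """Compare BrainInfo identifiers recorded for the same brain regions.

    Both arguments map a region name to a BrainInfo identifier (or None).
    Returns (mismatches, missing): regions whose identifiers disagree, and
    regions for which the second source records no identifier.
    """
    mismatches, missing = [], []
    for region, wiki_id in sorted(wiki_links.items()):
        other_id = neurolex_links.get(region)
        if other_id is None:
            missing.append(region)
        elif other_id != wiki_id:
            mismatches.append((region, wiki_id, other_id))
    return mismatches, missing


if __name__ == "__main__":
    brede_wiki = {
        "corpus callosum": "173",
        "dentate gyrus": "161",
        "lateral ventricle": "191",
        "region A": "ID-1",  # placeholder entry
        "region B": "ID-2",  # placeholder entry
    }
    neurolex = {
        "corpus callosum": None,
        "dentate gyrus": None,
        "lateral ventricle": None,
        "region A": "ID-1",
        "region B": "ID-9",
    }
    print(compare_links(brede_wiki, neurolex))
```

A reported mismatch only says that the two sources disagree; as the examples in the text show, deciding which side holds the typo still requires manual inspection.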

Linking coordinates
Stereotaxic coordinates form the basic data for most neuroimaging meta-analyses. The Brede system records this kind of data for the purpose of meta-analysis as well as to allow researchers to do "image-based" queries, e.g., to get coordinates near a query coordinate, so they can put their results in the context of previous research. As the number of coordinates in the Brede systems is limited, it is relevant to query other databases for further coordinates.
The Brede Database provides no outbound links other than to its own coordinate search facility. However, the Brede Wiki has, apart from links to coordinate search in the Brede Wiki and Database, outbound query coordinate links to the SumsDB and NeuroSynth coordinate databases, i.e., for each 3-dimensional coordinate encoded with the relevant MediaWiki templates in the Brede Wiki, the template creates links that search for nearby coordinates in SumsDB and NeuroSynth.
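A nearby-coordinate query reduces to a distance computation against the stored coordinates. The sketch below uses Euclidean distance with a fixed radius in millimetres; the record layout, the labels and the 10 mm default are assumptions for illustration and do not describe the search services of Brede, SumsDB or NeuroSynth.

```python
import math


def nearby(query, coordinates, radius=10.0):
    """Return (distance, record) pairs within radius mm, nearest first."""
    hits = []
    for record in coordinates:
        distance = math.dist(query, record["xyz"])
        if distance <= radius:
            hits.append((distance, record))
    return sorted(hits, key=lambda hit: hit[0])


if __name__ == "__main__":
    # Made-up records; both the labels and the coordinates are illustrative.
    stored = [
        {"label": "a", "xyz": (-42.0, 18.0, 6.0)},
        {"label": "b", "xyz": (-40.0, 20.0, 6.0)},
        {"label": "c", "xyz": (34.0, -60.0, 52.0)},
    ]
    for distance, record in nearby((-42.0, 18.0, 6.0), stored):
        print(round(distance, 2), record["label"])
```

The comparison is only meaningful when query and stored coordinates refer to the same stereotaxic space, which is why recording or harmonizing the space at data entry matters.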
Previously the Brede Wiki could use the presently unavailable web services ICBM View from the McConnell Brain Imaging Centre and the INC Interactive Talairach Atlas from the University of Minnesota for rendering stereotaxic coordinates online. With links set up via a Brede Wiki template, both these web visualization services would render a single coordinate on the background of a brain atlas when a wiki user clicked the link. With these services unavailable, the Brede Wiki lacks online visualization of stereotaxic coordinates, though the Brede Toolbox can still render coordinates from the Brede Wiki.
The Brede systems have far fewer coordinates than NeuroSynth, BrainMap and SumsDB. However, the Brede systems implement web-based query interfaces which expose their coordinates in a machine-readable format.

Other content
Apart from the components named above, the Brede Database and Wiki have structured content for researchers, organizations and journals, and the Brede Wiki furthermore for events. These components are less linked, as other neuroinformatics databases usually do not describe them in detail. The 130 researchers in the Brede Database each have a unique integer identifier. The Brede Wiki, with 684 researchers, instead uses the personal name as the wiki page identifier and may furthermore link to Neurotree, Wikipedia, Twitter, Microsoft Academic Search, Google Scholar and the Brede Database. A promising global system for uniquely identifying researchers exists with the Open Researcher & Contributor ID (ORCID) (Fenner, 2011). After the launch of ORCID we have added these identifiers to the Brede Wiki.
Organizations are connected in a hierarchical structure in the Brede Wiki and may link to Wikipedia and Microsoft Academic Search. Journals may link to the NLM Catalog, Microsoft Academic Search and Wikipedia. The information for journals also records naming variations, which was utilized in an analysis of scientific citations in Wikipedia (Nielsen, 2007).
Typical neuroinformatics databases represent authors as just keywords without trying to resolve the name to a unique researcher. Although the author resolution is not complete, the Brede systems provide a framework to uniquely represent authors and tie them to affiliations and external identifiers, e.g., as provided by Google Scholar. Linking researchers, publications and affiliations will allow other researchers to perform more precise social network analysis and bibliometrics. The Semantic Web has the Friend of a Friend (FOAF) vocabulary to describe people.

Data access
Newer open online federating databases, such as DBpedia, provide data items both as human readable web pages and as individual pages in a machine readable format such as JSON, N3 or RDF as well as in a form for download of the complete data sets, e.g., http://downloads.dbpedia.org/.
The Brede Database provides automatically generated but static human readable web pages for the database content. We also distribute the complete data set in multiple XML files. Separate files exist for experiment data (data from papers with Talairach coordinates, wobibs.xml), brain region ontology (worois.xml), topic ontology ("external components", woexts.xml), researchers (wopers.xml), organizations (woorgs.xml) and scientific journals (wojous.xml). The query interface for coordinates may serve human readable HTML as well as machine readable XML and comma-separated values files in connection with each query. An SPM plugin uses the machine readable content output for queries against the Brede Database from within this analysis program (Wilkowski et al, 2009). Access to the complete Brede Database in a machine readable XML format has enabled federation of the Brede Database into the Neuroscience Information Framework (NIF) and BODB. The Brede Database has provided no special level of access for these integration efforts, other than providing the XML files available for everyone. The NIF federation means that Internet users can search for brain coordinates in the Brede Database via the "Brain Activation Foci" search interface in http://neuinfo.org.
The Brede Wiki also makes the data available in several ways. The standard MediaWiki interface presents formatted readable and editable wiki pages. But the MediaWiki software may also serve the wiki content in other formats with the API (http://www.mediawiki.org/wiki/API:Mainpage). One may, e.g., query for the name of all pages tagged to a given category or all pages using a specific MediaWiki template and get the result back in JSON. MediaWiki has scripts to dump all pages of a wiki to XML. Like Wikipedia the Brede Wiki provides this XML dump (http://neuro.compute.dtu.dk/services/bredewiki/download/). A script also extracts the structured content in the Brede Wiki encoded with the MediaWiki templates, and we also make this available online as a SQLite file. The coordinate search facility in the Brede Wiki makes use of the SQLite file. The Brede Wiki also distributes a SKOS file for the Semantic Web (Miles and Bechhofer, 2009), where the URIs resolve at the prefix http://neuro.compute.dtu.dk/resource/. However, the items in the SKOS description do not yet link to other Linked Data Semantic Web resources (Berners-Lee, 2009). An OWL file is not yet available for the Brede tools. Presently a lookup on, say, http://neuro.compute.dtu.dk/resource/Amygdala results in a redirect to the Brede Wiki for the term: http://neuro.compute.dtu.dk/wiki/Amygdala. The plan is that the resource in the future should perform content negotiation and return data in the appropriate format, e.g., RDF or JSON, so the Brede system can join the linked data cloud. Semantic Web datasets may be published in these specialized file formats, but microformats present another way of publishing semantic information, where structured metadata are embedded in ordinary HTML web pages in a machine-readable form. The Brede Wiki represents bibliographic information in the ContextObjects in Spans (COinS) microformat constructed via a MediaWiki template. It should allow, e.g., general Internet search engines to discover that some of the pages on the Brede Wiki describe specific publications.
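The two kinds of API query mentioned above (pages in a category, pages using a template) can be sketched with the standard MediaWiki `categorymembers` and `embeddedin` list modules. This is only an illustration: the endpoint `https://wiki.example.org/w/api.php` is a placeholder, since the location of the Brede Wiki `api.php` is not given here, and the category name, template name and sample response are invented for the example.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint (assumption); substitute the wiki's real api.php URL.
API_URL = "https://wiki.example.org/w/api.php"


def category_members_url(category, limit=50, api_url=API_URL):
    """Build a query URL listing the pages tagged with a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def embedded_in_url(template, limit=50, api_url=API_URL):
    """Build a query URL listing the pages that use a template."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": "Template:" + template,
        "eilimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def page_titles(response_text, list_name):
    """Extract page titles from the JSON text returned by a list query."""
    data = json.loads(response_text)
    return [page["title"] for page in data["query"][list_name]]


# Invented response with the shape MediaWiki uses for list=categorymembers.
SAMPLE_RESPONSE = json.dumps({
    "query": {
        "categorymembers": [
            {"pageid": 1, "ns": 0, "title": "Amygdala"},
            {"pageid": 2, "ns": 0, "title": "Hippocampus"},
        ]
    }
})

if __name__ == "__main__":
    print(category_members_url("Brain regions"))
    print(page_titles(SAMPLE_RESPONSE, "categorymembers"))
```

The sketch only builds URLs and parses a canned response, so it runs without network access; fetching the URL with any HTTP client would return JSON of the same shape from a live wiki. Long result lists are paged by the API, which this sketch does not handle.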
The Brede Wiki has not yet added Semantic MediaWiki functionality (Krötzsch et al, 2006). Enabling this would allow for more complex queries than the queries possible with the standard MediaWiki API. However, the MediaWiki template format used in the Brede Wiki is relatively simple, enabling easy definition of semantic links. The Wikidata system under development may possibly also yield more advanced query functionality (Vrandečić, 2012).
A recent master's thesis from the Technical University of Denmark describes one of the few examples of meta-analysis with multiple database federation (Sigurdsson, 2012). The system federates brain region descriptions from the Brede Wiki and Database, the cognitive terms from CogPo and Cognitive Atlas and brain coordinates from SumsDB (which includes AMAT and Brede Database data) for data mining topics over multiple brain regions with an approach similar to Nielsen et al (2006).
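To indicate why a simple template format is easy to process, the following sketch extracts key-value fields from flat (non-nested) MediaWiki template calls. It is not the extraction script used for the Brede Wiki: the template name "Hypothetical coordinate" and its fields are invented for illustration, and the regular expression deliberately ignores nested templates and piped wiki links.

```python
import re

# Matches a flat template call: {{Name | key = value | key = value}}.
# Assumption: no nested templates or piped links inside the call.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def parse_templates(wikitext):
    """Return a list of (template name, field dict) for each flat template."""
    records = []
    for match in TEMPLATE_RE.finditer(wikitext):
        name = match.group(1).strip()
        fields = {}
        for part in match.group(2).split("|"):
            if "=" in part:
                key, value = part.split("=", 1)
                fields[key.strip()] = value.strip()
        records.append((name, fields))
    return records


# Invented wikitext; the template and field names are hypothetical.
SAMPLE_WIKITEXT = """
{{Hypothetical coordinate | x = -22 | y = -4 | z = -18 | region = Amygdala}}
Some free text between the templates.
{{Hypothetical coordinate | x = 24 | y = -2 | z = -20 | region = Amygdala}}
"""

if __name__ == "__main__":
    for name, fields in parse_templates(SAMPLE_WIKITEXT):
        print(name, fields)
```

Each extracted field can be read as a typed link from the page to a value, which is the sense in which flat templates allow semantic links to be defined without a dedicated extension.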

Discussion
Larger scale and more intimate sharing of data and metadata will lead to higher levels of data integration, and it is the hope that this will lead to neuroscience discovery. What weighs against federating neuroinformatics databases? Database federation seems not to have generated a significant number of neuroscientific articles that take advantage of the federation, at least not in macroscopic neuroscience, such as neuroimaging.
Maintaining a neuroinformatics database requires continuous support. Database ownership has importance for grants. By "owning" the data in the database the administrator can ensure proper attribution when used, e.g., through co-authorship or at least citations. Initiated more than 20 years ago, BrainMap has had the longest experience with neuroinformatics databasing (Fox and Lancaster, 1994), and as such provides one of the best observations on "how to run a neuroinformatics database". After the emergence of coordinate-based meta-analysis methods (Turkeltaub et al, 2002; Nielsen and Hansen, 2002), the implementation of a meta-analytic tool aligned with the database and further population of the database, BrainMap has now served as database for over 100 meta-analytic studies (Laird et al, 2011). BrainMap has achieved this number without any significant database federation.
No testing of the Brede Wiki with the Semantic MediaWiki extension has been done. In a separate project we are currently performing a systematic review of the Wikipedia research literature (Okoli et al, 2012) and use the Semantic MediaWiki WikiLit (http://wikilit.referata.com) to keep track of and annotate scientific articles, a system inspired by WikiPapers. Testing WikiLit shows that Semantic MediaWiki is a convenient tool to represent, annotate and query bibliographic information about scientific articles. Semantic MediaWiki also works well when defining ontologies. However, it is less clear how convenient Semantic MediaWiki is for individual scientific data from scientific articles, e.g., stereotaxic coordinates, brain volume measurements and personality genetics data. The issue of representing brain volume data with a Semantic MediaWiki was discussed by Nielsen et al (2012). There exist Semantic MediaWiki extensions that enable representation of n-ary data (e.g., x, y and z for stereotaxic coordinates) so multiple data items can be present on one wiki page, e.g., the Semantic Internal Objects extension. Otherwise each data item needs its own wiki page. Until now we have continued to stay "compatible" with Wikipedia by only using the standard template facility in MediaWiki. We are still exploring whether Semantic MediaWiki would be suitable for our kind of scientific data. We have focused on meta-analyses, and Semantic MediaWiki does not provide functionality for the computational part of meta-analysis, so we selected the simplest format for the data, comma-separated values, rather than invoking the more complex Semantic MediaWiki framework. The new developments on MediaWiki with the Lua programming language extension and Wikidata will be very interesting to follow. They may have a major impact on how scientific data can be represented with a wiki and how simple computations can be made.
Neuroinformatics databases may serve several purposes, e.g., for information retrieval of relevant studies, for meta-analyses and as data, models and code repositories. It seems that most neuroinformatics database federation efforts have primarily targeted information retrieval and repository purposes. Increased focus on meta-analysis in database federation would possibly lead to tighter integration between neuroinformatics databases.

Conclusion
The Brede Database and Wiki strive to provide a framework and content for simple Open Science neuroinformatics while connecting to other neuroinformatics databases.
The Brede systems distribute their entire content online under a share-alike license. In contrast to the Neuroscience Information Framework, Brede does not yet have a federated database query and aggregation. With adoption of the Semantic Web framework for Brede a more direct way of querying and federating Brede data should be possible. The announced projects ORCID and Wikidata are general in scope but may in the future become important resources for neuroscience. We will follow the development of these technologies and furthermore continue to explore how well Semantic MediaWiki can be used to represent neuroscientific data.

Acknowledgment
Lundbeck Foundation has funded Finn Årup Nielsen through the CIMBI project. The 3 reviewers and the editor are thanked for suggesting improvements to the manuscript. Daniela Balslev is also thanked for comments to the manuscript.

Figure 1: Relations between some of the online databases. A visualization inspired by the Linked Open Data Cloud diagram by Richard Cyganiak and Anja Jentzsch (http://lod-cloud.net). The large neuroinformatics database federation NIF has been left out. An arrow indicates whether a database has identifiers for items in other databases making outbound deep linking possible. The light gray fill color indicates a dedicated neuroinformatics database, while white fill color indicates more general scientific resources.