1. Introduction
The discovery, description, and development of Earth’s mineral wealth have long been central pursuits of the Earth sciences. For much of that history, the discoveries of new mineral resources and novel mineral species have been based as much on chance finds as on empirical guidelines. The old adage, “Gold is where you find it,” has applied to most natural resources, but data-driven discovery is now changing that mantra. In this contribution, we review the nature of large and growing mineralogical data resources and describe some of the analytical and visualization methods that are being applied to understand the diversity and distribution of minerals in space and time.
Recent studies fall under three broad headings. Mineral evolution is the investigation of Earth’s changing near-surface mineralogy over 4.5 billion years of history—studies that reveal the striking co-evolution of the geosphere and biosphere and the increasing diversity and complexity of mineral species driven by the chemical differentiation of Earth [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27]. Mineral ecology, a complementary pursuit, investigates the diversity and spatial distribution of Earth’s minerals, including consideration of the unusual distribution of rare minerals on Earth [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39]. Finally, mineral network analysis provides a powerful means to analyze and visualize the complex distributions of minerals and their properties through space and time [40]. Taken together, these approaches have the potential to change our view of the evolving mineralogy of Earth and other terrestrial worlds.
2. Mineral data resources
Data-driven discovery relies on comprehensive and reliable tabulations of mineral species, their properties, and their distributions in space and time. The official list of mineral species approved by the International Mineralogical Association (IMA) is documented by the IMA database,††† which is maintained at the Department of Geosciences, The University of Arizona [41]. In addition to recording more than 5400 mineral species, the RRUFF data resource compiles data on crystal structures, compositions, Raman spectra, and other physical properties. Mineral evolution studies require data on mineral ages, localities, and context; these data are compiled in the Mineral Evolution Database.‡ More than 185 000 individual locality/age records for minerals are available through this rapidly expanding, open-access resource.
The largest data resource on the global distribution of minerals is mindat.org,†† an international, crowd-sourced effort led by Jolyon Ralph and the Hudson Institute of Mineralogy. The mindat.org data source has recorded more than 1.1 million mineral/locality records from approximately 300 000 localities worldwide. These data are essential in the analysis and visualization of mineral diversity and distribution relationships.
The essential resources of the IMA database and the mindat.org data source are amplified by a number of other data compilations, most notably the petrological and geochemical resources under the umbrella of the Interdisciplinary Earth Data Alliance (IEDA‡‡), including EarthChem††† (e.g., Ref. [42]).
An ongoing challenge in developing these critical data resources is the vast amount of “dark data,” that is, information on mineral compositions, localities, and other attributes that is available only through hard-copy publications, proprietary corporate documents (notably those of companies in the natural resources industry), or privately held research records. Data-driven discovery cannot reach its full potential until a culture of data sharing is fully embraced by the Earth science community, with the implementation of “FAIR” (i.e., findable, accessible, interoperable, and reusable) data practices [43].
Given the rich and growing open-access mineralogical data resources, opportunities for applying a range of powerful analytical and visualization methods beckon [44], [45]. In this article, we review a few of these methods as they relate to the fields of mineral evolution, mineral ecology, and mineral network analysis.
3. Mineral evolution
Mineral evolution is the study of the changing near-surface mineralogy of Earth and other terrestrial worlds through deep time [5], [19]. Our detailed understanding of Earth’s 4.5-billion-year history of mineralogical change, coupled with a growing understanding of the mineralogy of other solar system bodies [46], [47], reveals that a planet’s mineralogy evolves through a sequence of stages, each the result of new physical, chemical, and (in the case of Earth) biological modes of mineral paragenesis.
The more than 185 000 individual locality/age records for minerals tabulated in the Mineral Evolution Database, though far short of recording all available mineral/age information, are sufficiently extensive to reveal striking patterns in Earth’s evolving mineralogy. Three first-order trends stand out.
The first trend in the temporal distribution of minerals is a marked episodicity that reflects the supercontinent cycle of the past 3 billion years [8], [12]. We find that Earth has preserved pulses of mineralization during five purported episodes of the convergence and assembly of sometime isolated landmasses into single supercontinents (Fig. 1) [39]. The convergence of continents and consequent orogenic events not only induce mineralization; these mineralizing events are also more likely to be preserved in the cores of the resulting mountain ranges. More detailed investigation of these trends reveals additional subtleties, for example in the unique tectonic and geochemical setting of the assembly of Rodinia at ∼1.3 to 0.9 Ga [27].
The second significant temporal trend in Earth’s evolving mineralogy is an observed increase in the average oxidation state of transition metals [20], [48]. For example, the minerals of manganese display a systematic increase in redox state over the past 500 million years, with other fluctuations occurring earlier in Earth’s history (Fig. 2). Similar trends have been observed for all of the redox-sensitive, first-row transition metals (Fig. 3†††), as well as for uranium [6] and rhenium [20].
The third trend in the evolution of the mineral world is its increasing structural and chemical complexity with the flow of geological time (Fig. 4) [5], [11], [26]. Numerical estimates of complexity using information-based measures have facilitated the analysis of quantitative correlations between chemical and structural complexities of minerals for a total of 4962 datasets on the chemical compositions and 3989 datasets on the crystal structures of minerals [23], [26]. This analysis demonstrates that there is an overall trend of increasing structural complexity with increasing chemical complexity. Moreover, analysis of mean chemical and structural complexities for mineral groups occurring in different geological periods [5], [15] has demonstrated that both are gradually increasing in the course of mineral evolution. By analogy with biological evolution [49], the increasing mineral complexity follows an overall passive trend: More complex minerals form with the passage of geological time, yet the simpler ones are not replaced (see also Ref. [35]). The observed correlations suggest that, at a first approximation, chemical differentiation is a major force driving the increase of complexity of minerals throughout Earth’s history. New levels of complexity and diversification observed in mineral evolution are achieved through local concentrations of particular rare elements and the creation of new geochemical environments.
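To make the idea of an information-based complexity measure concrete, the following minimal sketch computes the Shannon information per atom of an idealized chemical formula. This is one commonly used information-based measure; that it matches the exact definition used in Refs. [23], [26] is an assumption, and the function name and the two example formulas are chosen here purely for illustration.

```python
import math


def shannon_information_per_atom(atom_counts):
    """Shannon information (bits per atom) of a formula.

    atom_counts maps each element symbol to its number of atoms
    per formula unit. The measure is -sum(p_i * log2(p_i)), where
    p_i is the fraction of atoms belonging to element i.
    """
    total = sum(atom_counts.values())
    info = 0.0
    for count in atom_counts.values():
        p = count / total
        info -= p * math.log2(p)
    return info


# Idealized formulas used only as illustrations.
halite = {"Na": 1, "Cl": 1}          # NaCl
calcite = {"Ca": 1, "C": 1, "O": 3}  # CaCO3

print(shannon_information_per_atom(halite))   # 1.0 bit per atom
print(shannon_information_per_atom(calcite))  # larger than for halite
```

Under this measure, a formula with more elements in less even proportions carries more information per atom, which is the sense in which a chemically differentiated mineral is "more complex" than a simple binary compound.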
4. Mineral ecology
Mineral ecology considers the diversity and spatial distribution of minerals, in much the same way as studies of biological ecosystems document distributions of living species. Earth’s minerals are distributed according to a “large number of rare events” (LNRE) frequency spectrum, which is common to both biological ecosystems and the distribution of words in a book [29], [31], [37]. In each instance, a few species or words are extremely common, but most species or words are rare.
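The frequency spectrum behind an LNRE distribution can be illustrated with a short sketch that counts, for each number of localities m, how many species are known from exactly m localities. The input layout (a mapping from locality name to a list of species) and the toy records are invented for illustration and are not drawn from mindat.org.

```python
from collections import Counter


def frequency_spectrum(locality_minerals):
    """Return a Counter mapping m -> number of species found at exactly m localities."""
    localities_per_species = Counter()
    for minerals in locality_minerals.values():
        # set() guards against a species being listed twice at one locality.
        localities_per_species.update(set(minerals))
    return Counter(localities_per_species.values())


# Hypothetical toy records: one very common species, several rare ones.
toy_localities = {
    "loc_A": ["quartz", "calcite", "pyrite"],
    "loc_B": ["quartz", "calcite"],
    "loc_C": ["quartz", "rare_species_1"],
    "loc_D": ["quartz", "rare_species_2"],
}

spectrum = frequency_spectrum(toy_localities)
for m in sorted(spectrum):
    print(m, spectrum[m])
```

Even in this tiny example, most species occur at a single locality while one species occurs everywhere, which is the qualitative shape of an LNRE distribution.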
Our detailed understanding of the distributions of common and rare mineral species is made possible by the mineral/locality data in mindat.org. These data facilitate the calculation of “accumulation curves,” which yield estimates of the numbers of “missing” minerals, that is, species that occur on Earth but have yet to be discovered and described [28], [32]. For example, in a detailed study of the more than 400 carbon-bearing minerals, Hazen et al. [33] predicted that an additional ∼145 carbon-bearing minerals await discovery (Fig. 5). In addition, they listed several hundred candidates for these missing minerals, noting that most would be hydrous carbonates, with a special emphasis on calcium- and sodium-bearing phases that may have been overlooked because they are relatively nondescript: typically white or grey in color and poorly crystallized [32]. This work inspired the Carbon Mineral Challenge,† an international project supported by the Deep Carbon Observatory‡ to find as many of the missing carbon-bearing minerals as possible. As of 20 May 2019, at least 30 new carbon-bearing species had been discovered, described, and approved by the IMA.
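As a rough illustration of an accumulation curve, the sketch below computes the mean number of distinct species encountered as localities are sampled in random order. It produces only the empirical curve; extrapolating that curve to estimate the number of missing minerals requires fitting a statistical model, which is not attempted here. The function name, parameters, and toy records are assumptions made for this example.

```python
import random


def accumulation_curve(locality_minerals, n_shuffles=200, seed=0):
    """Mean number of distinct species seen after sampling k localities (k = 1..K).

    The order of localities is shuffled n_shuffles times and the
    cumulative species counts are averaged over the shuffles.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    names = list(locality_minerals)
    totals = [0] * len(names)
    for _ in range(n_shuffles):
        rng.shuffle(names)
        seen = set()
        for k, name in enumerate(names):
            seen.update(locality_minerals[name])
            totals[k] += len(seen)
    return [t / n_shuffles for t in totals]


# Hypothetical toy records, invented for illustration.
toy_localities = {
    "loc_A": ["quartz", "calcite", "pyrite"],
    "loc_B": ["quartz", "calcite"],
    "loc_C": ["quartz", "rare_species_1"],
    "loc_D": ["quartz", "rare_species_2"],
}

curve = accumulation_curve(toy_localities)
print(curve)
```

A curve that is still rising steeply at its final point suggests that additional sampling would continue to turn up new species, which is the intuition behind estimates of undiscovered minerals.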
5. Mineral co-occurrence and network analysis
One of the most important challenges of mineralogy is to understand the diversity and distribution of minerals in the context of coexisting assemblages of minerals—a problem that requires considering hundreds of species simultaneously. The large and growing mindat.org data resource, coupled with a variety of analytical and visualization methods, is revolutionizing our ability to document these complex multidimensional systems.
5.1. Chord diagrams
The first step in any analysis of mineral coexistence is to construct a data object with each mineral species as a separate field. In the simple case of a pairwise mineral co-occurrence matrix, each matrix element represents the number of times that two minerals occur together. These data can be represented by a variety of techniques. Chord diagrams array a group of related mineral species as arcs of a circle, with curved lines connecting coexisting species (Fig. 6). Widely employed in gene analysis, such chord diagrams can also prove useful in mineralogy by illustrating numerous pairwise occurrences in a single visual representation. Chord diagrams can be explored in interactive displays, with embedded metadata on numbers of occurrences, as well as details on localities and other coexisting species.
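The pairwise co-occurrence matrix described above can be sketched as follows, assuming the input is a mapping from locality name to the species recorded there. The function names and toy records are hypothetical; real mineral/locality tables would need to be parsed into this shape first.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(locality_minerals):
    """Count, for each unordered pair of species, the localities where both occur."""
    counts = Counter()
    for minerals in locality_minerals.values():
        # Sorting gives each pair a single canonical (a, b) key with a < b.
        for a, b in combinations(sorted(set(minerals)), 2):
            counts[(a, b)] += 1
    return counts


def as_matrix(counts, species):
    """Expand pair counts into a symmetric matrix ordered like `species`."""
    index = {name: i for i, name in enumerate(species)}
    matrix = [[0] * len(species) for _ in species]
    for (a, b), n in counts.items():
        i, j = index[a], index[b]
        matrix[i][j] = n
        matrix[j][i] = n
    return matrix


# Hypothetical toy records, invented for illustration.
toy_localities = {
    "loc_A": ["quartz", "calcite", "pyrite"],
    "loc_B": ["quartz", "calcite"],
    "loc_C": ["quartz", "pyrite"],
}

species = ["calcite", "pyrite", "quartz"]
matrix = as_matrix(cooccurrence_counts(toy_localities), species)
for name, row in zip(species, matrix):
    print(name, row)
```

Each off-diagonal element of this matrix could supply the width of one chord in a chord diagram, or the color of one cell in a heat map.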
5.2. Klee diagrams
Klee diagrams (sometimes referred to as “heat maps”; Fig. 7) also represent the frequency with which pairs of objects—such as minerals or their essential chemical elements—coexist, and thus are a complementary visualization tool to the chord diagram shown in Fig. 6. This method facilitates rapid analysis of coexisting pairs of minerals or elements; however, it is often desirable to understand the associations of more than two objects at a time. Accordingly, Ma et al. [50] have explored the use of interactive three-dimensional Klee diagrams to understand coexisting elements in minerals (Fig. 8). In spite of their potential for quickly revealing occurrence trends among thousands of mineral pairs, Klee diagrams have not yet been widely applied to mineral coexistence relationships.
5.3. Network analysis
Network analysis is an especially useful tool for exploring complex interrelationships among numerous mineral species [40]. The use of network graphs to elucidate connections in the contexts of social groups [51], [52], [53], [54], technological networks [55], [56], [57], [58], and biological systems [59], [60], [61], [62] is well known. Each network consists of vertices (or nodes), some of which are connected to each other by edges (or links). Distances between nodes, and hence the lengths of links, are determined by the degree of association of the two nodes; the shortest distances represent the strongest links. Vertices and edges can be sized, shaped, and colored to indicate additional attributes of the system.
Networks of coexisting minerals provide vivid examples of network graphs.† In Fig. 9 [40], individual nodes represent mineral species. The nodes are sized to represent the relative number of localities of each species, while node colors can represent compositional, structural, paragenetic, or other information. These highly interactive visual displays represent projections from multidimensional space into two- or three-dimensional space, in order to show the connections from each mineral node to all other co-occurring mineral nodes. In general, for a well-connected network of N different mineral species, the rendering is a projection from N – 1 dimensions. In many instances, a three-dimensional rendering provides important additional information, even though the projection may be from much higher dimensions.
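A mineral co-occurrence network of the kind described above can be sketched with plain dictionaries. Here node size is the number of localities for each species, and an edge is drawn whenever two species share at least one locality. The text does not specify how the degree of association is measured, so the use of the Jaccard index for link strength, and of one minus that index for target link length, is an assumption made only for this example; the function name and toy records are likewise hypothetical.

```python
from itertools import combinations


def build_network(locality_minerals):
    """Return (node_sizes, edges) for a mineral co-occurrence network.

    node_sizes maps species -> number of localities.
    edges maps (a, b) -> {"strength": Jaccard index, "length": 1 - strength}.
    """
    localities_of = {}
    for locality, minerals in locality_minerals.items():
        for mineral in set(minerals):
            localities_of.setdefault(mineral, set()).add(locality)

    node_sizes = {m: len(locs) for m, locs in localities_of.items()}

    edges = {}
    for a, b in combinations(sorted(localities_of), 2):
        shared = localities_of[a] & localities_of[b]
        if not shared:
            continue  # species never coexist, so no edge is drawn
        union = localities_of[a] | localities_of[b]
        strength = len(shared) / len(union)  # assumed association measure
        edges[(a, b)] = {"strength": strength, "length": 1.0 - strength}
    return node_sizes, edges


# Hypothetical toy records, invented for illustration.
toy_localities = {
    "loc_A": ["quartz", "calcite", "pyrite"],
    "loc_B": ["quartz", "calcite"],
    "loc_C": ["quartz", "galena"],
}

node_sizes, edges = build_network(toy_localities)
for pair, attrs in sorted(edges.items()):
    print(pair, attrs)
```

A layout algorithm would then try to place the nodes so that drawn distances approximate these target lengths, which is where the projection from many dimensions into two or three comes in.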