1. Introduction

The discovery, description, and development of Earth’s mineral wealth have long been central pursuits of the Earth sciences. For much of that history, the discoveries of new mineral resources and novel mineral species have been based as much on chance finds as on empirical guidelines. The old adage, “Gold is where you find it,” has applied to most natural resources, but data-driven discovery is now changing that mantra. In this contribution, we review the nature of large and growing mineralogical data resources and describe some of the analytical and visualization methods that are being applied to understand the diversity and distribution of minerals in space and time.

Recent studies fall under three broad headings. Mineral evolution is the investigation of Earth’s changing near-surface mineralogy over 4.5 billion years of history; these studies reveal the striking co-evolution of the geosphere and biosphere and the increasing diversity and complexity of mineral species driven by the chemical differentiation of Earth [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. Mineral ecology, a complementary pursuit, investigates the diversity and spatial distribution of Earth’s minerals, including consideration of the unusual distribution of rare minerals on Earth [28][29][30][31][32][33][34][35][36][37][38][39]. Finally, mineral network analysis provides a powerful means to analyze and visualize the complex distributions of minerals and their properties through space and time [40]. Taken together, these approaches have the potential to change our view of the evolving mineralogy of Earth and other terrestrial worlds.

2. Mineral data resources

Data-driven discovery relies on comprehensive and reliable tabulations of mineral species, their properties, and their distributions in space and time. The official list of mineral species approved by the International Mineralogical Association (IMA) is documented by the IMA database, which is maintained at the Department of Geosciences, The University of Arizona [41]. In addition to recording more than 5400 mineral species, the RRUFF data resource compiles data on crystal structures, compositions, Raman spectra, and other physical properties. Mineral evolution studies require data on mineral ages, localities, and context, all of which are compiled at the Mineral Evolution Database. More than 185 000 individual locality/age data for minerals are available through this rapidly expanding, open-access resource.

The largest data resource on the global distribution of minerals is mindat.org, an international, crowd-sourced effort led by Jolyon Ralph and the Hudson Institute of Mineralogy. The mindat.org data source has recorded more than 1.1 million mineral/locality data from approximately 300 000 localities worldwide; these data are essential in the analysis and visualization of mineral diversity and distribution relationships.

The essential resources of the IMA database and the mindat.org data source are amplified by a number of other data compilations, most notably the petrological and geochemical resources under the umbrella of the Interdisciplinary Earth Data Alliance (IEDA), including EarthChem (e.g., Ref. [42]).

An ongoing challenge in developing these critical data resources is the vast amount of “dark data,” that is, information on mineral compositions, localities, and other properties that is available only through hard-copy publications, proprietary corporate documents (notably those of companies in the natural resources industry), or privately held research records. Data-driven discovery cannot reach its full potential until a culture of data sharing is fully embraced by the Earth science community, with the implementation of “FAIR” (i.e., findable, accessible, interoperable, and reusable) data practice [43].

Given the rich and growing open-access mineralogical data resources, opportunities for applying a range of powerful analytical and visualization methods beckon [44][45]. In this article, we review a few of these methods as they relate to the fields of mineral evolution, mineral ecology, and mineral network analysis.

3. Mineral evolution

Mineral evolution is the study of the changing near-surface mineralogy of Earth and other terrestrial worlds through deep time [5][19]. Our detailed understanding of Earth’s 4.5-billion-year history of mineralogical change, coupled with a growing understanding of the mineralogy of other solar system bodies [46][47], reveals that a planet’s mineralogy evolves through a sequence of stages, each the result of new physical, chemical, and (in the case of Earth) biological modes of mineral paragenesis.

The more than 185 000 individual locality/age data for minerals tabulated in the Mineral Evolution Database, though far short of recording all available mineral/age information, are sufficiently extensive to reveal striking patterns in Earth’s evolving mineralogy. Three first-order trends stand out.

The first trend in the temporal distribution of minerals is a marked episodicity that reflects the supercontinent cycle of the past 3 billion years [8][12]. We find that Earth has preserved pulses of mineralization during five purported episodes of the convergence and assembly of formerly isolated landmasses into single supercontinents (Fig. 1) [39]. The convergence of continents and consequent orogenic events not only induce mineralization; these mineralizing events are also more likely to be preserved in the cores of the resulting mountain ranges. More detailed investigation of these trends reveals additional subtleties, for example in the unique tectonic and geochemical setting of the assembly of Rodinia at ∼1.3 to 0.9 Ga [27].

Fig. 1. First-row transition metal mineral–locality occurrences by max age (minerals listed once with highest oxidation state from any first-row transition elements in formula). Our record of Earth’s minerals through time typically reveals pulses of mineralization that are associated with the supercontinent cycle. In this graph of approximately 60 000 mineral/age data for minerals incorporating first-row transition metals, pulses of mineralization are associated with the supercontinents Kenorland, Nuna, Rodinia, Pannotia, and Pangea. Note that mineralization associated with Rodinian assembly at ∼1.3–0.9 Ga is less distinct than the peaks associated with other supercontinents, as a consequence of its unique tectonic setting [39]. 1+–8+ refer to different oxidation states.

The second significant temporal trend in Earth’s evolving mineralogy is an observed increase in the average oxidation state of transition metals [20][48]. Thus, for example, the minerals of manganese display a systematic increase in redox state over the past 500 million years, with other fluctuations occurring earlier in Earth’s history (Fig. 2). Similar trends have been observed for all of the redox-sensitive, first-row transition metals (Fig. 3), as well as for uranium [6] and rhenium [20].

Fig. 2. Changes in Earth’s near-surface oxidation state, the consequence of the evolution of oxygenic photosynthesis, are reflected in the changing ratios of manganese in the II (Mn2+), III (Mn3+), and IV (Mn4+) oxidation states. The average oxidation state of manganese increases, most notably during the past 500 million years. GOE: Great Oxidation Event.
Fig. 3. Normalized mineral–locality occurrences by max age for different elements. A “skyline diagram” of minerals containing first-row transition elements reveals systematic trends associated with the supercontinent cycle and Earth’s changing atmospheric composition.
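
The kind of summary plotted in Fig. 2 can be illustrated with a minimal sketch that bins mineral/age records and averages the oxidation state in each bin. The function name, the 500-million-year bin width, and the toy records are assumptions made for illustration; they are not values from the Mineral Evolution Database.

```python
from collections import defaultdict

def mean_oxidation_state_by_bin(records, bin_width_ma=500):
    """records: iterable of (max_age_ma, oxidation_state) pairs.
    Returns {bin_start_ma: mean oxidation state}, sorted by bin start."""
    sums = defaultdict(lambda: [0, 0])  # bin start -> [sum of states, record count]
    for age_ma, state in records:
        start = int(age_ma // bin_width_ma) * bin_width_ma
        sums[start][0] += state
        sums[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(sums.items())}

# Toy manganese records, invented purely for illustration (age in Ma, Mn oxidation state)
records = [(2700, 2), (2600, 2), (1900, 2), (1800, 3), (400, 3), (300, 4), (100, 4)]
means = mean_oxidation_state_by_bin(records)
```

With real data, each record would be one mineral/locality/age entry, and the choice of bin width would affect how sharply any trend appears.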

The third trend in the evolution of the mineral world is its increasing structural and chemical complexity with the flow of geological time (Fig. 4) [5][11][26]. Numerical estimates of complexity using information-based measures have facilitated the analysis of quantitative correlations between chemical and structural complexities of minerals for a total of 4962 datasets on the chemical compositions and 3989 datasets on the crystal structures of minerals [23][26]. This analysis demonstrates that there is an overall trend of increasing structural complexity with increasing chemical complexity. Moreover, analysis of mean chemical and structural complexities for mineral groups occurring in different geological periods [5][15] has demonstrated that both are gradually increasing in the course of mineral evolution. By analogy with biological evolution [49], the increasing mineral complexity follows an overall passive trend: More complex minerals form with the passage of geological time, yet the simpler ones are not replaced (see also Ref. [35]). The observed correlations suggest that, at a first approximation, chemical differentiation is a major force driving the increase of complexity of minerals throughout Earth’s history. New levels of complexity and diversification observed in mineral evolution are achieved through local concentrations of particular rare elements and the creation of new geochemical environments.

Fig. 4. Mean chemical and structural information-based complexities for minerals occurring in different eras of mineral evolution (1 = 12 “ur-minerals” [5]; 2 = 60 minerals of chondritic meteorites [5]; 3 = 420 minerals of the Hadean epoch [11]; 4 = all minerals of the post-Hadean era) calculated for a total of 4962 datasets on the chemical compositions and 3989 datasets on the crystal structures of minerals [26]. (a) Shannon information per atom (IG); (b) Shannon information per unit cell or formula unit (IG,total).
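
As a rough illustration of the chemical measure plotted in Fig. 4, the sketch below computes a Shannon information per atom and per formula unit from the element counts of an idealized formula. It assumes the standard Shannon form, I_G = -sum(p_i * log2(p_i)), with p_i taken as the fraction of atoms belonging to element i, and I_G,total as the number of atoms times I_G. The function name and input format are hypothetical, and the exact conventions of Refs. [23][26] may differ; structural complexity would additionally require crystal-structure data that this sketch does not handle.

```python
import math

def shannon_complexity(atom_counts):
    """atom_counts: dict element -> number of atoms in the formula unit.
    Returns (bits per atom, bits per formula unit)."""
    total = sum(atom_counts.values())
    per_atom = -sum((n / total) * math.log2(n / total) for n in atom_counts.values())
    return per_atom, per_atom * total

# Idealized formulas used as illustrative input
halite = {"Na": 1, "Cl": 1}            # NaCl
calcite = {"Ca": 1, "C": 1, "O": 3}    # CaCO3

halite_bits = shannon_complexity(halite)
calcite_bits = shannon_complexity(calcite)
```

In this toy comparison the three-element carbonate carries more information per atom than the two-element halide, which is the sense in which chemically differentiated compositions score as more complex.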

4. Mineral ecology

Mineral ecology considers the diversity and spatial distribution of minerals, in much the same way as studies of biological ecosystems document distributions of living species. Earth’s minerals are distributed according to a “large number of rare events” (LNRE) frequency spectrum, which is common to both biological ecosystems and the distribution of words in a book [29][31][37]. In each instance, a few species or words are extremely common, but most species or words are rare.
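
A frequency spectrum of this kind can be tabulated directly from mineral/locality pairs. The following is a minimal sketch, assuming the data are available as simple (mineral, locality) tuples; the function name and the toy records are invented for illustration and do not reflect the mindat.org data format.

```python
from collections import Counter

def frequency_spectrum(records):
    """records: iterable of (mineral, locality) pairs.
    Returns {m: number of species found at exactly m localities}."""
    localities_per_mineral = Counter()
    for mineral, _locality in set(records):  # set() drops repeated reports of the same pair
        localities_per_mineral[mineral] += 1
    return dict(sorted(Counter(localities_per_mineral.values()).items()))

# Toy records, invented for illustration
records = [
    ("calcite", "A"), ("calcite", "B"), ("calcite", "C"), ("calcite", "D"),
    ("siderite", "A"), ("siderite", "B"),
    ("abelsonite", "C"),
    ("ikaite", "D"),
]
spectrum = frequency_spectrum(records)
```

Even in this tiny example the pattern is the one described above: a single species is found at many localities, while half of the species are known from only one.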

Our detailed understanding of distributions of common and rare mineral species is made possible by the mineral/locality data in mindat.org. These data facilitate the calculation of “accumulation curves,” which reveal estimates of the numbers of “missing” minerals, that is, species that occur on Earth but have yet to be discovered and described [28][32]. For example, in a detailed study of the more than 400 carbon-bearing minerals, Hazen et al. [33] predicted that an additional ∼145 carbon-bearing minerals await discovery (Fig. 5) [33]. In addition, they listed several hundred candidates for these missing minerals, noting that most would be hydrous carbonates, with a special emphasis on calcium- and sodium-bearing phases that may have been overlooked because they are relatively nondescript, typically white or grey in color and poorly crystallized [32]. This work inspired the Carbon Mineral Challenge, an international project supported by the Deep Carbon Observatory to find as many of the missing carbon-bearing minerals as possible. As of 20 May 2019, at least 30 new carbon-bearing species had been discovered, described, and approved by the IMA.

Fig. 5. (a) The frequency spectrum for carbon-bearing minerals reveals that most minerals are rare. The horizontal axis records the exact number of localities (m) at which a carbon-bearing mineral species is found. The vertical axis indicates how many mineral species occur at exactly that number of localities. Grey bars are the observed values, while blue bars indicate the modeled values. Of the 403 documented carbon-bearing minerals in 2016, more than 100 are known from only one locality, while 40 have been described from exactly two localities. (b) This “large number of rare events” distribution facilitates calculation of an accumulation curve (upper blue curve), shown here on a graph of the number of observed mineral/locality data (N, X axis) versus the estimated number of different mineral species (Y axis). Extrapolation of this curve to the right suggests that an additional 145 carbon-bearing minerals await discovery and description [33]. The vertical dashed line indicates the number of mineral/locality data (82 922) and known species (403) as of 2016. Curves 1 and 2 represent the evolving numbers of different mineral species identified from exactly one or two localities, respectively; these values change systematically as more mineral/locality data accumulate. Note that these curves go through a maximum value; the number of minerals known from only one locality is now declining as more mineral/locality data are reported.
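
The empirical part of such a plot can be sketched as follows: as mineral/locality records accumulate, track the number of distinct species seen so far together with the numbers of species known from exactly one and exactly two localities (the analogues of curves 1 and 2). This sketch only tabulates observed counts in the order given; it does not fit an LNRE model or extrapolate to the number of missing species, which is the step that yields the published predictions. The function name and toy records are assumptions for illustration.

```python
from collections import Counter

def accumulation_curves(records):
    """records: ordered list of distinct (mineral, locality) pairs.
    Returns three lists giving, after each record: species seen,
    species known from exactly one locality, and from exactly two."""
    counts = Counter()
    species, ones, twos = [], [], []
    for mineral, _locality in records:
        counts[mineral] += 1
        species.append(len(counts))
        ones.append(sum(1 for c in counts.values() if c == 1))
        twos.append(sum(1 for c in counts.values() if c == 2))
    return species, ones, twos

# Toy records, invented for illustration
records = [
    ("calcite", "A"), ("siderite", "A"), ("calcite", "B"),
    ("ikaite", "C"), ("siderite", "B"), ("calcite", "C"),
]
species, ones, twos = accumulation_curves(records)
```

In the toy output the single-locality count rises and then falls as additional records turn one-locality species into two-locality species, mirroring the maximum noted in the caption.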

5. Mineral co-occurrence and network analysis

One of the most important challenges of mineralogy is to understand the diversity and distribution of minerals in the context of coexisting assemblages of minerals—a problem that requires considering hundreds of species simultaneously. The large and growing mindat.org data resource, coupled with a variety of analytical and visualization methods, is revolutionizing our ability to document these complex multidimensional systems.

5.1. Chord diagrams

The first step in any analysis of mineral coexistence is to construct a data object with each mineral species as a separate field. In the simple case of a pairwise mineral co-occurrence matrix, each matrix element represents the number of times that two minerals occur together. These data can be represented by a variety of techniques. Chord diagrams array a group of related mineral species as arcs of a circle, with curved lines connecting coexisting species (Fig. 6). Widely employed in gene analysis, such chord diagrams can also prove useful in mineralogy by illustrating numerous pairwise occurrences in a single visual representation. Chord diagrams can be explored in interactive displays, with embedded metadata on numbers of occurrences, as well as details on localities and other coexisting species.

Fig. 6. A chord diagram of the 43 most common cobalt-bearing minerals reveals coexisting pairs of minerals. This rendering reveals that the secondary mineral erythrite (Co3(AsO4)2·8H2O) is the most abundant cobalt mineral, and that it is most commonly associated with the two most common primary cobalt ore minerals cobaltite (CoAsS) and skutterudite (CoAs3−x).
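
The pairwise co-occurrence matrix that underlies a chord diagram can be built with a few lines of code. The sketch below assumes the input is a mapping from locality to the set of minerals reported there; the function names and the toy localities are hypothetical and are not drawn from mindat.org.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(localities):
    """localities: dict locality -> set of mineral names.
    Returns a Counter keyed by alphabetically sorted mineral pairs."""
    pairs = Counter()
    for minerals in localities.values():
        for a, b in combinations(sorted(minerals), 2):
            pairs[(a, b)] += 1
    return pairs

def to_matrix(pairs, names):
    """Expand pair counts into a symmetric matrix ordered by `names`."""
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for (a, b), n in pairs.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return matrix

# Toy localities, invented for illustration
localities = {
    "loc1": {"erythrite", "cobaltite", "skutterudite"},
    "loc2": {"erythrite", "cobaltite"},
    "loc3": {"erythrite", "skutterudite"},
    "loc4": {"cobaltite"},
}
pairs = cooccurrence_counts(localities)
names = ["cobaltite", "erythrite", "skutterudite"]
matrix = to_matrix(pairs, names)
```

Each off-diagonal entry is the number of localities at which two minerals occur together; a chord-diagram library would take such a matrix as the widths of the connecting chords.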

5.2. Klee diagrams

Klee diagrams (sometimes referred to as “heat maps”; Fig. 7) also represent the frequency with which pairs of objects—such as minerals or their essential chemical elements—coexist, and thus are a complementary visualization tool to the chord diagram shown in Fig. 6. This method facilitates rapid analysis of coexisting pairs of minerals or elements; however, it is often desirable to understand the associations of more than two objects at a time. Accordingly, Ma et al. [50] have explored the use of interactive three-dimensional Klee diagrams to understand coexisting elements in minerals (Fig. 8). In spite of their potential for quickly revealing occurrence trends among thousands of mineral pairs, Klee diagrams have not yet been widely applied to mineral coexistence relationships.

Fig. 7. Klee diagrams (sometimes referred to as “heat maps”) represent the frequency with which pairs of minerals, elements, or other objects coexist. This rendering displays a 72 × 72 matrix of coexisting chemical elements in minerals, in which each matrix element represents the fraction of minerals with element X that also incorporates element Y. This matrix is not symmetrical; for example, all minerals containing beryllium also incorporate oxygen, but only a small fraction of oxygen-bearing minerals incorporate beryllium.
Fig. 8. A three-dimensional interactive Klee diagram facilitates the exploration of triplets of coexisting minerals or elements. This example from Ref. [50] records the frequency of co-occurrence of triplets of chemical elements in minerals. (a) The cube-shaped rendering is difficult to interpret, but any planar slice of the cube can be viewed independently; (b) alternatively, the cube can be rendered in an “exploded” version to allow users to see the “inside” of the cube. The red line indicates the centerline of the 3D diagram. The arrow points to one of many “hot spots,” in this case Ca + Ca + O, where the combination of elements is more commonly found in minerals than would be predicted based on crustal abundances. REE: rare earth elements.
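
The asymmetric two-dimensional matrix described for Fig. 7 can be sketched as a table of conditional fractions: for each ordered pair (X, Y), the fraction of minerals containing element X that also contain element Y. The function name and the five-mineral toy list below are assumptions for illustration; they are not the 72 × 72 dataset shown in the figure, and the three-dimensional case of Fig. 8 is not attempted here.

```python
def conditional_fractions(mineral_elements):
    """mineral_elements: dict mineral -> set of element symbols.
    Returns dict (X, Y) -> fraction of minerals with X that also contain Y."""
    elements = sorted(set().union(*mineral_elements.values()))
    fractions = {}
    for x in elements:
        with_x = [els for els in mineral_elements.values() if x in els]
        for y in elements:
            both = sum(1 for els in with_x if y in els)
            fractions[(x, y)] = both / len(with_x)
    return fractions

# Small illustrative list of minerals and their essential elements
minerals = {
    "beryl": {"Be", "Al", "Si", "O"},
    "bromellite": {"Be", "O"},
    "quartz": {"Si", "O"},
    "corundum": {"Al", "O"},
    "halite": {"Na", "Cl"},
}
fractions = conditional_fractions(minerals)
```

In this toy list every beryllium mineral contains oxygen, but only half of the oxygen-bearing minerals contain beryllium, which is the kind of asymmetry the caption of Fig. 7 points out.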

5.3. Network analysis

Network analysis is an especially useful tool for exploring complex interrelationships among numerous mineral species [40]. The use of network graphs to elucidate connections in the contexts of social groups [51][52][53][54], technological networks [55][56][57][58], and biological systems [59][60][61][62] is well known. Each network consists of vertices (or nodes), some of which are connected to each other by edges (or links). Distances between nodes, and hence the lengths of links, are determined by the degree of association of the two nodes; the shortest distances represent the strongest links. Vertices and edges can be sized, shaped, and colored to indicate additional attributes of the system.

Networks of coexisting minerals provide vivid examples of network graphs. In Fig. 9 [40], individual nodes represent mineral species. The nodes are sized to represent the relative number of localities of each species, while node colors can represent compositional, structural, paragenetic, or other information. These highly interactive visual displays represent projections from multidimensional space into two- or three-dimensional space, in order to show the connections from each mineral node to all other co-occurring mineral nodes. In general, for a well-connected network of N different mineral species, the rendering is a projection from N – 1 dimensions. In many instances, a three-dimensional rendering provides important additional information, even though the projection may be from much higher dimensions.
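
A minimal sketch of the underlying data structure may help make this concrete. It builds nodes sized by locality count and edges weighted by co-occurrence count, and assigns each edge a target length equal to the reciprocal of its weight so that stronger associations are drawn shorter. The reciprocal rule, the function name, and the toy counts are all assumptions for illustration; the layout algorithms used in Ref. [40] are not specified here, and the projection to two or three dimensions would be left to a graph-layout library.

```python
from collections import defaultdict

def build_network(pair_counts, locality_counts):
    """pair_counts: dict (mineral_a, mineral_b) -> co-occurrence count.
    locality_counts: dict mineral -> number of localities.
    Returns (nodes, edges) as plain dictionaries."""
    degree = defaultdict(int)
    edges = {}
    for (a, b), weight in pair_counts.items():
        if weight <= 0:
            continue  # minerals that never co-occur are left unconnected
        # Assumed rule: target length shrinks as the association strengthens
        edges[(a, b)] = {"weight": weight, "target_length": 1.0 / weight}
        degree[a] += 1
        degree[b] += 1
    nodes = {m: {"size": n, "degree": degree[m]} for m, n in locality_counts.items()}
    return nodes, edges

# Toy counts, invented for illustration
pair_counts = {
    ("cobaltite", "erythrite"): 2,
    ("cobaltite", "skutterudite"): 1,
    ("erythrite", "skutterudite"): 2,
    ("cobaltite", "safflorite"): 0,
}
locality_counts = {"cobaltite": 3, "erythrite": 3, "skutterudite": 2, "safflorite": 1}
nodes, edges = build_network(pair_counts, locality_counts)
```

The resulting dictionaries could be passed to any graph-drawing tool; node size and degree correspond to the visual attributes described above.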