Benchmarking of graphene-based materials: real commercial products versus ideal graphene

There are tens of industrial producers claiming to sell graphene and related materials (GRM), mostly as solid powders. Recently the quality of commercial GRM has been questioned, and procedures for GRM quality control were suggested using Raman spectroscopy or atomic force microscopy. Such techniques require dissolving the sample in solvents, possibly introducing artefacts. A more pragmatic approach is needed, based on fast measurements and not requiring any assumption on GRM solubility. To this aim, we report here an overview of the properties of commercial GRM produced by selected companies in Europe, USA and Asia. We benchmark: (A) size, (B) exfoliation grade and (C) oxidation grade of each GRM versus those of ‘ideal’ graphene and, most importantly, versus what is reported by the producer. In contrast to previous works, we report explicitly the names of the GRM producers and we do not re-dissolve the GRM in solvents, but only use techniques compatible with industrial powder metrology. A general common trend is observed: products having low defectivity (%sp² bonds >95%) feature low surface area (<200 m² g⁻¹), while highly exfoliated GRM show a lower sp² content, demonstrating that it is still challenging to exfoliate GRM at industrial level without adding defects.


Introduction
The seminal paper of Geim and Novoselov, published in 2004 [1] on the properties of graphene, triggered a rapid expansion of research on 2D materials, quickly generating a new technology, and the first attempts towards industrial applications. The Nobel prize was awarded to Geim and Novoselov in 2010, only 6 years after this original paper, and the first commercial products based on graphene composites reached the market soon after, in 2012 [2].
The most common, commercially-available application of graphene and related 2D materials (GRM) is in composites; the first applications of high-quality graphene in electronics, for example as transistors for bio-sensing devices, have also recently been commercialized [3]. Characterization of commercial products such as tennis rackets [4] or infra-red heaters [5] confirmed that such products truly contain GRM, even if the quality of such GRM is not as high as that of GRM typically used in research.
Since 2008 graphene has obtained its CAS number (1034343-98-0), a unique numerical identifier that unequivocally identifies a chemical substance [6]. Graphene is still, however, far from becoming a widespread technology. The main problem that renders several industries cautious in adopting this new material is the lack of a clear characterization and metrology of graphene.
The great hype existing around graphene has caused a proliferation of graphene producers, with several hundreds of 'graphene-based' products available on the market [7] with no common agreement on the most important parameters needed to compare different materials. Whilst a nomenclature [8] and classification framework (figure 1(a)) [9] have been proposed for 2D graphene-based materials, a clear agreement on standards is still missing.
The size of GRM flakes, as an example, is commonly reported in different formats: 1. the average ± standard deviation, 2. the minimum/maximum size range, 3. the fraction of flakes below/beyond a certain size, 4. the use of percentiles.
The range of the definitions traditionally comes from the use of nanosheet metrology in different applications and fields, both academic and industrial.
It is relatively straightforward to characterize GRM at an academic level: flake size can be observed by transmission electron microscopy (TEM) or atomic force microscopy (AFM), while Raman spectroscopy can measure the number of defects at the atomic scale, the number of mono-, bi- and few-layers present in the sample, and even the presence of doping [10,11]. These techniques work very well for high quality, 'research grade', monolayer graphene, but industrial GRM are, due to production costs, very different. They are challenging samples, composed of highly poly-dispersed flakes, with lateral sizes ranging from a few nm to tens of μm, thicknesses spanning from one to tens of layers, and a surface chemistry going from perfect sp² networks to highly defective oxidized structures. Several attempts have been made to characterise poly-dispersed GRM using techniques typical of academia [12][13][14]. In 2014 and 2017 we used automatic image processing of optical microscopy, SEM and AFM images to analyse the size distribution and shape of thousands of sheets of boron nitride [12] and graphene oxide [13], using this data to explain the physical mechanism of their fragmentation. In 2018 Castro-Neto and co-workers [14] used a similar approach, based on optical microscopy, to measure the size distribution of many commercial GRM. Results were reported in a statistical form, without naming the specific GRM studied.
Another publication underlined how industrial users look at single layer graphene, multi-layer graphene, and the graphene oxide derivatives as different materials which could be used for different applications; for many of such applications the lateral dimension and surface chemistry are more important than the number of layers [15].
While there are still different points of view on GRM classification, it is commonly agreed that a main issue for GRM is their poor solubility and processability, which requires extensive sonication and/or chemical functionalization [16].
However, the analyses mentioned above required the flakes to be dissolved in a solvent and spin-coated under optimal conditions, assuming a certain solubility of the GRM in the solvent [12][13][14]. As an example, in [14] the dissolution of the GRM was achieved by sonication for ca. 1 h; this surely helps to disperse the GRM, but will likely cause fragmentation [13], and thus a reduction in size of the larger flakes. We demonstrated in previous work that even 30 min sonication can reduce the area of monoatomic nanosheets from ≈10⁸ nm² to ≈10⁵ nm² [13]. Moreover, casting the exfoliated dispersions on a substrate could change (again) the size distribution of the GRM, due to aggregation during solvent evaporation.
Besides the possible artefacts due to fragmentation and aggregation, the approaches described above were time-consuming, and could hardly be performed routinely by industrial end-users. A realistic evaluation of the state-of-the-art of existing commercialized products requires a more pragmatic approach, based on fast measurements and not requiring any assumption on GRM solubility. With this aim, we have performed a systematic study of the properties of a range of selected GRMs, trying to define a standard and widely applicable procedure to compare them.
We examined techniques used to characterize GRM flakes and proved their suitability in terms of utility and speed, to analyse highly-defective industrial GRMs featuring flakes of different sizes, thicknesses and defectivity. We used these techniques to characterize and compare a range of GRM flakes coming from different companies located in Europe, Asia and USA. A complete list of the materials tested is summarised in table 1.
Using the results of this benchmarking analysis, we could draw some general conclusions from the range of properties measured on these test materials, and on how they correlate with each other.
We used as a starting point the classification framework previously proposed by Wick et al [9] which identifies three key properties of GRM (figure 1(a)): the flake thickness (i.e. the average number of graphene layers in a single flake), the lateral size of the flakes and the chemical purity of the flakes (i.e. surface chemistry, number of defects and oxidation level). We include in the comparison also the bulk density of the material, which is important for industrial applications, as detailed further on.
It should be underlined that the benchmarking activity we describe herein is not comprehensive for all commercial GRMs, but rather a proof-of-principle exercise. We performed extensive statistical measurements to characterize 12 out of the >1000 products that are claimed to be available commercially worldwide [7].
Also, we did not characterize other important types of GRM materials such as inks and films grown by chemical vapour deposition (CVD), which are most useful for electronics applications. Whilst standardisation work is also ongoing on these materials, here we focus on GRM flakes, commercialized in powder form, mainly for bulk applications.
The objective of this work is not to give a complete analysis of the commercial products tested, but to be the first step towards establishing commonly-accepted procedures and guidelines for benchmarking. These procedures need to be based upon selected techniques, capable of providing information on GRM products with good statistics within a reasonable timeframe.
We hope that the measurements reported here can stimulate a fruitful debate between academic and industrial groups to assess the quality of commercially available GRM, making more objective data available for evaluating the possible applications claimed for these unique materials.

Selection of GRM samples for benchmarking
The industrial sector of graphene production is in rapid evolution, with the number of graphene producers increasing continuously. Thus, instead of randomly analysing any product available on the market, we chose to perform a more in-depth analysis  of a group of twelve GRM products (table 1), selected following practical utility criteria. Many producers sell graphene mostly for R&D use, on the scale of a few grams, at prices which are acceptable for research but not for industrial use and applications. In this survey, we focused instead on the companies able to deliver GRM on the kilogram scale, which are the most interesting for industrial applications. We also focused on products that were more readily available, due to geographical proximity of the producers or existing ongoing collaborations [17].
We used only GRMs that can be purchased in our target market (Europe) and are thus, in the case of large-scale applications, more likely to be adopted by EU industries. It is well known that the majority of current GRM production, both in terms of number of producers and tons produced, is in Asia; however, we found it challenging to import test samples from all Asian producers to Europe, either due to customs barriers or company policies. Most Asian producers are focussed mainly on their national market and not all of them ship GRM abroad, as highlighted in a recent survey [18].
Even using these criteria, the number of possible products to compare remained very large (a single producer can sell tens of different grades of GRM). We describe below the main properties considered for the selected GRM, and how they were measured.

Exfoliation grade
The term graphene shall be used, strictly, only for a one-atom-thick sheet of hexagonally arranged, sp²-bonded carbon atoms that is not an integral part of a carbon material, but is freely suspended or adhered to a foreign substrate [8]. When two graphene sheets are stacked together to form a bi-layer (with AB stacking, the most common one, or AA stacking), the electronic properties are no longer those of graphene. For thicknesses above ten layers the electronic properties are similar to those of bulk graphite [19], while mechanical properties are known to vary with the number of layers, N. For example, the bending stiffness of an elastically-isotropic plate increases as a function of its thickness cubed. It has been suggested that this will also be the case for multi-layer graphene [14]. Because of the differences in the nature of the bonding within and between the layers, multi-layer graphene is not elastically isotropic and so the bending stiffness is not necessarily proportional to N³. Nevertheless, the bending stiffness is found to increase significantly as the number of layers increases and, although direct measurements of the stiffness are difficult to undertake, it appears that the bending stiffness increases at least as N² [20].
Even at the research level, no production technique allows 100% graphene monolayers in solution to be obtained: exfoliated graphene samples will always be composed of a mixture of mono-, bi- and multi-layers (conversely, graphene oxide monolayers may be obtained quantitatively in solution due to their solubility in water; they have also been studied as an 'ideal' 2D material in previous work) [13,21].
Due to the complexity of graphene-based materials, there is no unique way to define the actual amount of graphene monolayers present in a GRM sample. The yield of graphene produced is reported as: 1. Monolayer yield = number of monolayers/total number of graphitic flakes in solution. 2. Monolayer weight yield = total weight of monolayers/total weight of graphitic flakes in solution. 3. Exfoliation yield = weight of all graphitic material in solution/weight of starting graphite flakes.
For a more detailed discussion on exfoliation yield see the review in [22]. None of these approaches is better than the others, and their utility depends on what is important to obtain for a given application: the number of monolayers, their number plus their size, or just the total number of flakes, mono-or multi-layers, which shall be processed in solution.
Estimating the number of monolayers is a time-consuming task. The main technique used to discriminate mono- from multi-layers is Raman spectroscopy [23]; the number of graphene layers in a flake modifies the electron bands, thus changing the shape, width, and position of the 2D peak in the Raman spectrum (up to ~5 layers). Other techniques that may be used to identify true single layers and count precisely the number of layers are atomic force microscopy (AFM) [21,24], and transmission electron microscopy (TEM), but these techniques are typically highly localized and are not suited to analyze a statistically significant amount of data in a short time frame.
The specific surface area (SSA) is a key parameter, usually expressed in m² g⁻¹, to understand the morphology of a powder. The numerical quantification of the area of high surface powders is performed by measuring the amount of a physisorbed gas, typically nitrogen, under controlled temperature and pressure conditions. The fundamental theory used in most commercial and scientific instruments is the BET model (Brunauer-Emmett-Teller), and, for carbon based powders, the standard ASTM D6556-10 [25]. Using this technique it is possible to measure SSA values from thousands down to a few square meters per gram (m² g⁻¹).
The measurement of surface area has already been evaluated by the European Commission (EC) as a rapid way to discriminate nano-materials from conventional ones, combining SSA with measurement of the skeletal (i.e. absolute) density of the material [26]. Measurements of the SSA are sensitive to the measurement method used and to the chemical properties of the materials, and can be misleading in the case of, for example, microporous or sintered materials, as demonstrated by measurements performed on TiO₂, organic pigments or zeolites [27]. The test conditions of BET tests (e.g. processing in a vacuum) can also cause partial aggregation of the GRM, thus giving a SSA lower than the real one. However, results obtained by two leading FP7 projects, NANODEFINE and NANOREG, indicate that this technique could be used to allow faster and easier implementation of the EC nano-materials definition [27], an important and urgent topic also related to safety rules for new materials. Here, we use it for a more specific goal, i.e. to compare with each other GRM that share the same layered structure and the same sp² carbon-based backbone.
By measuring the SSA of GRM, it is possible to estimate the number of layers n composing each flake with the formula n = 2/(ρ d S), where ρ is the density of graphite (2.267 g cm⁻³), d is the spacing between stacked graphene sheets (0.34 nm) and S is the SSA of the specific GRM.
This rough calculation allows the exfoliation grade of GRM to be estimated: totally exfoliated materials would have an SSA similar to ideal graphene (close to 2600 m² g⁻¹), while poorly exfoliated GRMs would have an SSA similar to graphite powder (≈0.1 m² g⁻¹). As an example, an average flake thickness of 10 monolayers would give a surface area of ≈260 m² g⁻¹.
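The estimate above can be checked numerically. The following is a minimal Python sketch that only restates n = 2/(ρ d S) with the constants quoted in the text; the function names are illustrative and are not part of the original work.

```python
# Constants quoted in the text, converted to consistent units.
RHO_GRAPHITE_G_PER_M3 = 2.267e6  # 2.267 g cm^-3 = 2.267e6 g m^-3
LAYER_SPACING_M = 0.34e-9        # 0.34 nm between stacked graphene sheets


def layers_from_ssa(ssa_m2_per_g: float) -> float:
    """Average number of layers n = 2 / (rho * d * S) for an SSA given in m^2/g."""
    if ssa_m2_per_g <= 0:
        raise ValueError("SSA must be positive")
    return 2.0 / (RHO_GRAPHITE_G_PER_M3 * LAYER_SPACING_M * ssa_m2_per_g)


def ssa_from_layers(n_layers: float) -> float:
    """Inverse relation: SSA in m^2/g expected for flakes made of n layers."""
    if n_layers <= 0:
        raise ValueError("number of layers must be positive")
    return 2.0 / (RHO_GRAPHITE_G_PER_M3 * LAYER_SPACING_M * n_layers)


if __name__ == "__main__":
    # Hypothetical SSA values chosen only to span the range discussed in the text.
    for ssa in (2600.0, 260.0, 10.0):
        print(f"SSA = {ssa:7.1f} m^2/g -> n ~ {layers_from_ssa(ssa):.1f} layers")
```

With these constants a single layer corresponds to about 2595 m² g⁻¹, consistent with the value of roughly 2600 m² g⁻¹ quoted for ideal graphene. The result is an average over the whole powder, not a thickness distribution.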
We should underline that the SSA value does not give the actual number of monolayers or the thickness distribution, but may be used to rapidly give an estimate of how much the GRMs are exfoliated, i.e. if they are more similar to 'ideal', perfectly exfoliated graphene or to graphite powder. Measurements of SSA can also be performed in solution, measuring the adsorption of organic dyes on graphene; at research level, it is possible to correlate the macroscopic SSA with nanoscale measurements, obtained with scanning tunnelling microscopy or molecular dynamics [28]. These methods are, again, only useful at research level; thus, focusing on industrial GRM, we used here standard SSA measurements based on gas adsorption. We measured the SSA of the selected GRMs by following the ASTM D6556-10 standard method. All the samples were degassed at 300 °C for 3 h. The instrument used was the ASAP 2020 (Micromeritics, USA). Two samples were prepared and measured for each powder, taking the arithmetic mean of the values measured as the main result, and the semi-dispersion as the error. Table 1 shows, in column 2, the SSA measured. We observed a broad range of SSA values for GRM, from a few m² g⁻¹ to >1000 m² g⁻¹, close to the theoretical one of graphene. As a reference, we also compared the industrial GRM with a highly exfoliated, thermally-reduced graphene oxide (TRGO) produced by the University of Freiburg. We selected this material as a benchmark comparison because its production is halfway between the lab scale and the pilot plant scale [29], and because there is a large amount of data published on its characterization and applications [30].
We evaluated how much our measurements agree with those reported by the producers. In table 1, 'reported' values in column 3 are the values reported by the producers on technical datasheets or web pages, while measured values are those measured experimentally by us. Figure 2 shows a graphical comparison of the SSA measured with that reported by the producers. The measured and reported data have the same order of magnitude for GRM with SSA >10 m² g⁻¹, with differences <30%. On the other hand, the material with the lowest SSA shows larger differences, but this is not unusual in measurements of materials with low SSA. Overall, our data suggest that the SSA reported by a wide range of GRM producers can be considered a reliable value to estimate the exfoliation grade of the product. The measured SSA was then correlated to other materials properties, as detailed below.
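The way the duplicate measurements are reduced to a value and an error can be sketched as follows. This is a minimal illustration, assuming that 'semi-dispersion' means half the difference between the largest and smallest reading; the function name and the input numbers are hypothetical and are not measured data.

```python
def mean_and_semi_dispersion(values):
    """Return (arithmetic mean, semi-dispersion) of replicate measurements.

    Semi-dispersion is taken here as (max - min) / 2, an assumption
    about the definition used for the error.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one measurement is required")
    mean = sum(values) / len(values)
    semi_dispersion = (max(values) - min(values)) / 2.0
    return mean, semi_dispersion


if __name__ == "__main__":
    # Two hypothetical SSA readings (m^2/g) for the same powder.
    mean, err = mean_and_semi_dispersion([100.0, 110.0])
    print(f"SSA = {mean:.1f} +/- {err:.1f} m^2/g")
```

With only two replicates the semi-dispersion is simply half their difference, so it describes the spread of the pair rather than a statistical standard error.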

Lateral size average and distribution
The lateral size of 2D nanosheets is a fundamental parameter to be evaluated because it has an impact on the performance of the final materials, influencing the mechanical and electrical properties in polymer composites [31], charge transport [21], gas permeation [32] and even biological activity in cells [33].
Most published articles and technical datasheets available report only the average lateral size of GRM, quantified by two common statistical parameters: arithmetic mean (x) and standard deviation (σ), often assuming that the length of the nanosheets follows a Gaussian distribution. However, all published experimental data show that for any given 2D material [12,34] the particle size distribution (PSD) is non-Gaussian, is highly asymmetric and can show complex shapes. In particular, the standard deviation does not give direct information on the breadth of a distribution.
In general, the PSD is defined in terms of a probability density function $p(x)$, as follows: $\mathrm{PSD}(x) = N_{\mathrm{tot}}\, p(x)$, where $N_{\mathrm{tot}}$ is the total number of particles. Thus, $p(x_0)$ corresponds to the probability of finding particles with the given size $x_0$ and likewise, $\mathrm{PSD}(x_0)$ counts the number of particles with the corresponding size.
The PSD of exfoliated GRM can be modelled roughly using a log-normal distribution, as recently observed experimentally for graphene [35], graphene oxide [36] and boron nitride [12].
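As an illustration of what modelling a PSD with a log-normal distribution involves, the sketch below estimates the two log-normal parameters from a list of lateral sizes by taking the mean and standard deviation of their logarithms. It assumes NumPy is available; the function names and the synthetic data are hypothetical and do not reproduce the analysis of the cited works.

```python
import numpy as np


def fit_lognormal(sizes):
    """Estimate (mu, sigma) of a log-normal model from positive sizes.

    mu and sigma are the mean and sample standard deviation of ln(size).
    """
    sizes = np.asarray(sizes, dtype=float)
    if sizes.size < 2 or np.any(sizes <= 0):
        raise ValueError("need at least two strictly positive sizes")
    logs = np.log(sizes)
    return float(logs.mean()), float(logs.std(ddof=1))


def lognormal_median(mu):
    """Median of a log-normal distribution, exp(mu)."""
    return float(np.exp(mu))


if __name__ == "__main__":
    # Synthetic sizes (arbitrary units) drawn from a known log-normal law.
    rng = np.random.default_rng(0)
    sizes = rng.lognormal(mean=1.0, sigma=0.5, size=20000)
    mu, sigma = fit_lognormal(sizes)
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, median = {lognormal_median(mu):.2f}")
```

For a log-normal sample the arithmetic mean is larger than the median, which is one reason why a mean ± standard deviation summary describes such samples poorly.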
We have previously analysed the size distribution of monoatomic, perfectly 2D nanosheets of graphene oxide using statistical studies, performed by us using image recognition software on AFM, scanning electron microscopy (SEM) and fluorescent microscopy. This procedure allowed us to measure precisely not only the fraction of sheets having a given lateral size, but also their aspect ratio and form factor, i.e. how much their shape differs from a circle. Such analysis revealed that the log-normal model is just a rough approximation of a more complex size distribution, involving two different populations of large and small sheets [13].
However, this procedure requires a significant amount of data, as well as high-quality images of the material, with single sheets deposited flat on a substrate, with minimal overlap between sheets. Industrial GRM (figure 1) are instead often composed of thick, irregular or crumpled platelets, with a strong tendency to aggregate.
To encourage industrial stakeholders to adopt GRM in large scale applications, methods compatible with industrial standards are needed, capable of analysing GRMs on a large scale with high statistical significance, high speed and low cost. Fortunately, there are already several techniques commercially available to measure the size distribution of more conventional nano-or micro-powders with a classical, 3D shape. Techniques such as dynamic light scattering [12,37,38], analytical ultracentrifugation [36,39] or even acoustic spectroscopy [40] have been already used at the lab scale to measure the lateral size of GRMs. This type of characterization is extensively used as a routine standard characterization in many industrial sectors (e.g. food and pharmaceuticals).
A possible problem with this approach is that most particle analysis techniques interpret data assuming that the particles have a 3D, isotropic shape. Thus, strictly speaking, they cannot be used to measure the size of 2D platelets such as graphene. In particular, the interpretation of dynamic light scattering (DLS) measurements is typically based on the Stokes-Einstein equation, which assumes the particles to be hard spheres [41]; thus, it would not be suitable for anisotropic, flat 2D nanoparticles.
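To make the hard-sphere assumption explicit, the sketch below writes out the Stokes-Einstein relation in its usual form, d = k_B T / (3 π η D), which converts a diffusion coefficient into the diameter of an equivalent sphere. The default temperature and viscosity (water at about 25 °C) and the function names are assumptions made for illustration only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_diameter(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Equivalent-sphere diameter (m) from a diffusion coefficient (m^2/s)."""
    if diffusion_m2_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    return K_B * temperature_k / (3.0 * math.pi * viscosity_pa_s * diffusion_m2_s)


def diffusion_coefficient(diameter_m, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Diffusion coefficient (m^2/s) of a hard sphere of the given diameter (m)."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return K_B * temperature_k / (3.0 * math.pi * viscosity_pa_s * diameter_m)


if __name__ == "__main__":
    # Hypothetical 200 nm sphere in water.
    d_coeff = diffusion_coefficient(200e-9)
    print(f"D = {d_coeff:.3e} m^2/s -> d = {hydrodynamic_diameter(d_coeff) * 1e9:.1f} nm")
```

The single output diameter is the point of the argument: for a flat platelet the relation still returns one equivalent-sphere size, which is not the lateral size of the sheet.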
However, several works correlating DLS and TEM measurements suggest that standard DLS theory may be used for 2D platelets, by applying a simple scalar correction coefficient [12,38,42]. Furthermore, the GRM we examined cannot be approximated as 2D nanosheets, unlike perfect graphene. SEM images (figures 1(b)-(f)) show that many of the commercial materials examined are crumpled irregular aggregates of 2D nanosheets whose mesoscopic shape is 3D, more similar to that of conventional powders. Furthermore, even a perfect, ideal 2D nanosheet will not be flat in solution or in a composite matrix, but will crumple and fold assuming a 3D shape due to entropic or surface chemistry effects [43]. In practice, when measuring LS of micron-sized particles, a cumulative signal is registered coming simultaneously from an enormous number of randomly oriented particles.
We therefore chose, among the many techniques available, a static light scattering (LS) analysis technique as a high-throughput and reliable way to measure the size distribution of the target GRM. It should be noted that LS is already used by some industrial GRM producers to define the lateral size of their GRMs [44]. The dispersion procedure did not require the GRM to be soluble in the selected solvent, and was carefully tuned to avoid aggregation of particles on the one hand, and further exfoliation or fragmentation of the particles on the other, for example by prolonged sonication (see SI).
This technique uses a charge-coupled device (CCD) camera to register, with high resolution, the light of a laser scattered by a dispersion of nanoparticles in solution. It measures the dependence of the average scattered intensity on the scattering angle and is sensitive to spatial variations in the dielectric constant. Such a technique has several advantages over other techniques previously used [14]: (1) it is fast, allowing a measurement to be performed in a few seconds; (2) it works for particles in liquids, and does not require the particles to be deposited on a substrate, thus avoiding artefacts due to additional aggregation; (3) it can measure particles with a wide size range, from 1 μm to 2.5 mm equivalent spherical diameter (in contrast, dynamic light scattering can only measure small particles with size comparable to the light wavelength, ideally a few hundred nm); (4) it does not only provide an average size but also gives the size distribution of the particles' population. It is thus also highly suited for samples that do not follow a Gaussian size distribution, and for samples composed of mixtures of particles of different nature; (5) it is widely used at the industrial level (e.g. in the food and pharmaceutical industries) and defined by an industrial standard (ISO13320).
We should underline that LS does not properly provide the PSD, as in the case of the microscopies previously described, due to its non-linear sampling. Moreover, while PSD is usually defined in terms of number distribution (i.e. each particle has equal weighting once the final distribution is calculated), LS measurements provide a volume distribution in which each particle volume has equal weighting. This is analytically defined as the incremental volume percent distribution (IVPD) and is derived from the cumulative volume distribution sampled in log-scale. Because of the irregularity of the shape of the particles, IVPD corresponds to the measurement of the effective diameter of an equivalent spheroid. It is very useful to describe particles with sizes spanning several orders of magnitude.
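The difference between a number distribution and a volume distribution can be illustrated with a short sketch. It assumes that each size class is represented by an equivalent sphere, so that its volume weight scales with the cube of the diameter; the function name and the example values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def number_to_volume_fractions(diameters, counts):
    """Convert particle counts per size class into volume fractions.

    Each class is weighted by count * diameter**3 (equivalent-sphere
    assumption), then normalised so that the fractions sum to 1.
    """
    diameters = np.asarray(diameters, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if diameters.shape != counts.shape:
        raise ValueError("diameters and counts must have the same shape")
    weights = counts * diameters ** 3
    total = weights.sum()
    if total <= 0:
        raise ValueError("total volume must be positive")
    return weights / total


if __name__ == "__main__":
    # 1000 particles of 1 um and a single particle of 10 um (made-up numbers).
    print(number_to_volume_fractions([1.0, 10.0], [1000, 1]))
```

In this made-up example the single 10 μm particle carries the same volume as the thousand 1 μm particles, so a volume-weighted distribution gives far more weight to the largest flakes than a number-weighted one.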
LS was thus used to give an estimate of the relative abundance in volume of particles with a given diameter (D) in a mixture; a typical size distribution measured by LS is shown in figure 3; the IVPD of all measured samples are available in the SI. They show that all GRM have very different and irregular size distributions, in many cases suggesting a combination of different populations of flakes, similarly to what we observed in more detail for GO nanosheets [13]. Such size distribution cannot be described by a scalar number, just providing the average size, but requires more refined analysis, as detailed in the following section.

Classification of size distribution using percentiles
While the graphical representation of the IVPD provides a complete description of the flakes' abundance, it would be better for practical use to define a series of statistical parameters to compare the size of GRM. Given that the starting data measured are cumulative distributions, it is useful here to use the percentiles (Dx). The most useful and intuitive percentile is D50: in a poly-dispersed sample, 50% v/v of the particles will have a lateral size smaller than D50 and the other half will have a lateral size larger than D50.
D50 shall be defined as the median of the sample size distribution, dividing the distribution in two parts of equal volume. The D50 gives an average size of the particles, but gives no indication of the breadth of the size distribution, and of the polydispersity of the sample.
For this reason, two additional percentiles are commonly defined to determine if a sample contains very small or very large particles. D10 defines the smallest 10% fraction of the sample, while D90 will include the smaller 90% fraction of the sample, excluding the largest 10% particles fraction. The size range between D10 and D90 therefore includes the most representative 80% fraction of the sample, excluding the largest and smallest flakes.
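The percentiles described above can be extracted from a measured distribution as in the following sketch. It assumes that the instrument exports one incremental volume value per size class, that each size is the upper edge of its class, and that interpolation is done on a logarithmic size axis; the function name and the example numbers are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def volume_percentiles(sizes_um, incremental_volume, percentiles=(10, 50, 90)):
    """Return Dx values (same unit as sizes_um) from an incremental volume distribution."""
    sizes = np.asarray(sizes_um, dtype=float)
    ivpd = np.asarray(incremental_volume, dtype=float)
    if sizes.ndim != 1 or sizes.shape != ivpd.shape:
        raise ValueError("sizes and volumes must be 1D arrays of equal length")
    if np.any(sizes <= 0) or np.any(np.diff(sizes) <= 0):
        raise ValueError("sizes must be positive and strictly increasing")
    if np.any(ivpd < 0) or ivpd.sum() <= 0:
        raise ValueError("volumes must be non-negative with a positive sum")
    # Cumulative volume percent below each size class edge.
    cumulative = 100.0 * np.cumsum(ivpd) / ivpd.sum()
    # Interpolate ln(size) against cumulative percent; requests below the
    # first cumulative value are clamped to the smallest size class.
    log_sizes = np.interp(percentiles, cumulative, np.log(sizes))
    return {f"D{p:g}": float(np.exp(v)) for p, v in zip(percentiles, log_sizes)}


if __name__ == "__main__":
    # Made-up distribution with three size classes (um) and their volume percent.
    print(volume_percentiles([1.0, 10.0, 100.0], [10.0, 40.0, 50.0]))
```

The D10-D90 interval returned by such a routine spans the central 80% of the sample by volume, which is the quantity compared with the producers' size ranges later in the text.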
Each GRM producer therefore generally aims at maximizing one of these parameters in particular, depending on the target application. For example, gas barrier applications require a significant number of large sheets, thus maximizing D90 [45,46], while applications in energy storage would require smaller sheets with a large number of sheet edges to favour intercalation [22], with some large sheets needed to favour charge transport, thus having a wide D10-D90 range. Biological applications may instead prefer GRM with a narrow D10-D90 size distribution, to better correlate a given size with the biological effects on cellular functionality [33].
Figure 3 shows an example of the size distribution measured for a GRM powder (XGNP-M15). The abscissa axis reports the lateral size, while the ordinate gives the incremental percentage of volume occupied by the particles. The IVPD shows a peak between 20 and 30 μm, with a long tail of smaller particles below 1 μm. It is evident from figure 3 that the IVPD cannot be approximated as a Gaussian distribution, so reporting it as a simple average plus or minus a standard deviation is not correct.
Similar reasoning can be performed on all samples analysed; the PSD of all of the samples observed are reported in SI, and showed a complex and multimodal distribution. This is due to the different synthesis, purification and processing steps performed for large scale production, which are confidential and undisclosed by the producers.
The D50 calculated from the experimental data are reported in table 1 and figure 4. They show a broad range of sizes, with GRM products going from 2 μm to more than 100 μm. This is expected, given that the different commercial GRM are meant for different target applications. The range of sizes measured overlaps well with what is typically reported in academic articles, spanning from 0.5 to 2 μm (typical of pristine graphene exfoliated in solvents [47]) to >100 μm (typical of water-soluble graphene oxide single sheets [48]).
As before, we compared our measurements with what was reported in the technical data sheets from the GRM producers (where available).
It was not possible to plot the producers' data in figure 3 because they were given in different formats (e.g. as size range, largest or smallest size, average) so they were reported as a table next to the figure. The method used to calculate the flake size reported is often not described by the producer; single values should be considered an arithmetic average, so may be compared to the D50 value measured experimentally, while size ranges may be considered as including at least 90% of the particles composing the sample.
As for the exfoliation grade, the measured lateral size also agrees roughly with the values reported by the producers. However, it is evident that the reported value in many cases does not provide information as complete as the D10-50-90 percentiles.
For example, a size >50 μm is given for sample AVA-FLG18; this corresponds well with the measured D50, but does not give information on the real size range (80% of the particles, by volume, are between 10 and 100 μm).
For sample AVA-FLG23, a smaller sheet size is reported (20 μm), although the real sample shows small flakes of ≈30 μm together with larger flakes (up to 240 μm).
Overall, it seems clear that the information available on the size of commercial GRM is neither standardized nor complete, and a more coherent method to report it is needed. Different approaches need to be used for this, including a full graphical representation of the size distribution of the sample (as available in SI). In our opinion, the use of the D10-50-90 values described above is a complete, fast and reliable method to describe the size distribution of a real sample.
It is often assumed implicitly, in the field of graphene production, that higher exfoliation requires more energetic treatments (e.g. longer sonication time), thus leading to fragmentation and to a smaller flake size. While we recently demonstrated and explained this assumption for sonicated, fully exfoliated GO [13], we did not find such evidence in the case of commercial GRM. The TRGO reference sample, which is the most exfoliated one (see above), gave a D50 of 24 μm, comparable to that of most GRM samples. This is in stark contrast with typical GRM produced in the lab, where the exfoliation grade and sheet size are well correlated, with graphene oxide monolayers having larger sheets than non-oxidised graphene obtained by sonication in solvents [16].
We can see that all commercial GRM samples are quite different from 'ideal' graphene (represented as a flake having SSA = 2600 m2 g−1, red vertical line in figure 5). We should underline, however, that such properties are unrealistic for bulk GRM materials, which should have a cost compatible with large scale applications, ideally <100 $ kg−1. For comparison, high-quality monolayer graphene produced by CVD has a price of ca. 450 dollars for a 4-inch wafer [49], which would correspond, if translated in dollars/gram, to a cost six orders of magnitude higher than that of bulk GRM.
The goal of a useful benchmarking method for GRM is thus not to test whether they are composed of 'ideal' graphene; that they are not is already well known in the academic community [14]. The real goal of our approach is to define their properties in a coherent and comparable way, allowing an industrial end user to select the best GRM for their target application.

Oxidation grade
Ideal graphene is entirely composed of carbon atoms, bound together through sp2 bonds to form a perfect honeycomb lattice. Graphene obtained by mechanical exfoliation can show the features of ideal graphene, but graphene obtained by industrial techniques such as exfoliation or CVD always presents a wide range of defects. These can range from simple deformation of the honeycomb structure (lattice vacancies, sheet edges, Stone-Wales defects etc) to the presence of atoms other than carbon, covalently bound to or even embedded into the graphene lattice. The most common heteroatom found in GRM is oxygen, which in graphene oxide sheets represents a significant fraction of the total atoms.
Raman Spectroscopy is the main technique used to quantify defects in graphene. In particular, the D peak in the Raman spectrum of graphene is not present in perfect graphene, but increases in intensity as the number of defects increases.
The D peak is sensitive to anything disrupting the symmetry of the graphene honeycomb lattice, such as grain boundaries, vacancies, edges, C atoms with sp3 hybridization etc, but does not allow the type of defect causing the disruption to be deduced.
The ratio between the intensities of the D and G peaks, I(D)/I(G), is often used to estimate the level of defects in graphene; it increases as the average distance between defects (La) decreases, down to about 2 nm [10]. For highly defective materials with La < 2 nm, however, this relationship breaks down: highly defective materials can show a value of I(D)/I(G) smaller than that of graphene with a low level of defects. Furthermore, flakes smaller in diameter than the laser spot give rise to a significant D peak from their edge sites, which may be mistaken for sp3-like defects in the basal plane. More information on the nature of the defects in graphene can be obtained from the D′ band, which is often found as a high-wavenumber shoulder of the G band [50]. In particular, I(D)/I(D′) is at a maximum (~13) for sp3 defects, decreases for vacancy-like defects and reaches a minimum (~3.5) for boundaries.
Raman spectroscopy is (and will likely remain) the best technique to characterize high-quality graphene. However, it is important to use additional techniques capable of determining not only the number of defects, but also their chemical nature. This is important at the research level, but it is even more important for industrial products, because we do not know how they have been produced, to which chemicals they have been exposed, and thus which kind of contaminants are present. To this aim, we used x-ray photoelectron spectroscopy (XPS) to perform a systematic characterization of all the GRM studied.
In XPS the photoemission signal depends strongly on the type of emitting atoms. Different atomic species have different binding energies of their core electrons. Although core levels are not directly involved in bonding between atoms, they are strongly affected by changes in the valence states through molecular bonding: this effect is the so-called chemical shift. The ability to measure, in a quantitative way, the presence of different atomic species and their chemical bonds, combined with an extreme surface sensitivity, makes XPS an ideal tool for the chemical investigation of surfaces and thin films. This is why XPS is also known by the alternative acronym ESCA (electron spectroscopy for chemical analysis).
When characterizing GRM, XPS can first be used to detect the presence of different heteroatoms (nitrogen, oxygen, metals etc). Then, a high-resolution spectrum centred on the C peak can be used to estimate the abundance of different specific bonds: aromatic sp2, sp3 defects, hydroxyl (C-OH), epoxy (C-O-C), carbonyl (C=O) and carboxyl (O-C=O), etc [51]. In this way, a single technique can determine both the chemical impurities (presence of heteroatoms) and structural defects (disruption of the honeycomb lattice) in graphene nanosheets. Determining these quantities accurately is still challenging in XPS analysis; however, by using a new protocol to deconvolute the carbon C 1s peak we could calculate with high precision the amount of oxygen and the relative concentrations of structural defects (i.e. specific bonds such as sp2, hydroxyl, epoxy, etc) [52].
We also considered alternative techniques for the chemical characterization of GRM, such as thermal gravimetric analysis (TGA) or infra-red (IR) spectroscopy, which are however more qualitative than XPS. IR analysis can identify the different carbon chemical groups such as C-O and C-C, but does not provide information on the presence of other heteroatoms as contaminants in the GRM; furthermore, the oxygen content is estimated from stoichiometric considerations about the different C-O bands and not from a separate peak such as the O 1s in XPS. Elemental analysis could instead give a quantitative estimation of the presence of different elements, but gives no information on the chemical state of such elements (epoxy, sp2 aromatic, etc). TGA detects the presence of different chemical moieties in the sample using their different thermal stability, but it is often difficult to deconvolute the effect of different groups on the gravimetric curve. XPS provides at the same time a quantitative estimation of the presence of different elements and of their chemical state, and is thus an ideal technique for GRM analysis.
XPS has some intrinsic limits that must be taken into account: (i) surface sensitivity: XPS is extremely sensitive to the surface oxidation and defects (sp2 fraction) up to a depth of 3-10 nm, which is the optimal probing depth for highly exfoliated materials (few nm thick flakes), but less effective for the thicker, poorly exfoliated materials; (ii) adsorbed and intercalated water is always present, but can be minimized by long pre-treatment (24 h) in an ultra-high vacuum environment [53].
The details on how the XPS measurements were performed are reported in the SI. In all commercial GRMs, oxygen was the most abundant species after carbon (see SI), but XPS also allowed us to detect the presence of other atoms such as nitrogen and sulphur. Table 1 shows the amount of O atoms present in each sample. It also shows the % of sp2 bonds present between C atoms, calculated from the fitting of the C 1s spectrum.
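To make the two quantities reported in table 1 concrete, the sketch below shows how a %sp2 value and an atomic percentage could be derived once peak areas are available. It is a simplified illustration only, not the fitting protocol of [52]: the function names, the component labels, the peak areas and the sensitivity factors are all hypothetical placeholders, and real sensitivity factors depend on the instrument.

```python
def sp2_percent(c1s_component_areas):
    """Share (in %) of the fitted C 1s area assigned to aromatic sp2 carbon."""
    total = sum(c1s_component_areas.values())
    if total <= 0:
        raise ValueError("C 1s component areas must sum to a positive value")
    return 100.0 * c1s_component_areas.get("sp2", 0.0) / total


def atomic_percent(peak_areas, sensitivity_factors):
    """Atomic % of each element from peak areas divided by sensitivity factors."""
    corrected = {el: peak_areas[el] / sensitivity_factors[el] for el in peak_areas}
    total = sum(corrected.values())
    return {el: 100.0 * area / total for el, area in corrected.items()}


# Hypothetical fitted areas of the C 1s components (arbitrary units)
c1s_fit = {"sp2": 70.0, "sp3": 10.0, "C-OH": 8.0,
           "C-O-C": 7.0, "C=O": 3.0, "O-C=O": 2.0}
print(sp2_percent(c1s_fit))          # share of sp2 carbon in this made-up fit

# Hypothetical survey areas and placeholder sensitivity factors
survey = atomic_percent({"C": 900.0, "O": 300.0}, {"C": 1.0, "O": 3.0})
print(survey)
```

Keeping the two steps separate reflects the text: heteroatom content comes from comparing peaks of different elements, while the sp2 fraction comes from the components of the C 1s peak alone.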
It could be expected that oxygen atoms chemically bound to the GRM should disrupt the sp2 lattice, so that the %sp2 should decrease in proportion to the amount of oxygen. Figure S1 in SI (stacks.iop.org/TDM/6/025006/mmedia) shows that such a relationship is roughly present, but with strong variations from sample to sample; O and C atoms can bind together in different ways, creating epoxy, carboxy or hydroxyl groups, thus generating the complex structure typical of graphene oxide [54].
Another common assumption in the graphene community is that a high number of defects is needed to foster efficient exfoliation; our data indicate that this assumption is statistically correct. Figure 6 shows the correlation between the defectivity of the graphene lattice (% of sp2) and the exfoliation grade (SSA). All the samples showing %sp2 > 95 are grouped in the region with SSA < 200 m2 g−1 (corresponding to an average flake thickness of ⩾13 monolayers). This means, as expected, that it is relatively easy to prepare GRMs with a low percentage of defects and a low degree of exfoliation. To obtain GRMs with a low number of layers (SSA > 200 m2 g−1), it is usually necessary to increase the exfoliation of graphite with techniques (i.e. chemical or physical methods) that often increase the damage to the aromatic network, as shown in figure 6 for GRM with SSA > 200 m2 g−1. The percentage of defects as a function of the SSA of GRMs depends on many parameters, but the production technique and the post-treatment of the material (e.g. thermal annealing) certainly have a strong influence on the quality and properties of the GRM. The statistical trend we observed here refers only to commercial GRM, which should be produced in large quantity and at low cost. It is of course possible to produce materials having high sp2 content and high exfoliation grade on lab scale for research goals, where cost is not a limiting factor.
The presence of sp3 defects, as compared to sp2 bonds, seems thus to be the key factor hindering the re-stacking of GRM after production, allowing them to remain exfoliated even in powder form.
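The link between SSA and average thickness quoted above (200 m2 g−1 corresponding to about 13 monolayers) can be reproduced with a simple estimate. The sketch below assumes that only the two outer faces of a stack of N layers are accessible, so that N ≈ 2600/SSA. The function names and the region labels are hypothetical; the thresholds (95% sp2, 200 m2 g−1) are those discussed in the text.

```python
IDEAL_SSA_M2_G = 2600.0  # SSA of 'ideal' single-layer graphene used in the text


def average_layers(ssa_m2_g):
    """Rough average number of stacked layers, assuming N = 2600 / SSA."""
    if ssa_m2_g <= 0:
        raise ValueError("SSA must be positive")
    return IDEAL_SSA_M2_G / ssa_m2_g


def trend_region(sp2_percent, ssa_m2_g, sp2_limit=95.0, ssa_limit=200.0):
    """Place a sample relative to the %sp2 versus SSA trend described above."""
    if sp2_percent > sp2_limit and ssa_m2_g < ssa_limit:
        return "low defects, low exfoliation"
    if sp2_percent <= sp2_limit and ssa_m2_g >= ssa_limit:
        return "high exfoliation, more defects"
    return "outside the general trend"


print(average_layers(200.0))        # 13.0 layers at the 200 m2/g threshold
print(trend_region(97.0, 150.0))    # hypothetical sample values
print(trend_region(80.0, 600.0))    # hypothetical sample values
```

This estimate neglects edge area and porosity, so it should be read as an order-of-magnitude guide rather than a measured thickness.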

Bulk density
A property of GRM important for industrial applications (but often underestimated at research level) is their density. The 'bulk' density of a powder is the ratio of the mass of an untapped powder sample and its volume including the contribution of the inter-particulate void volume. Hence, the bulk density depends on both the density of powder particles and the spatial arrangement of particles in the powder bed.
Due to differences in morphology, the GRMs show different bulk density values and this property strongly influences the processability of GRMs in the preparation of composites.
Nano-materials such as GRM can have a very low density, appearing as highly 'fluffy' powders; this causes problems for: (1) processing (e.g. they are difficult to feed into a standard extruder); (2) transport (large containers are required to transport a few kg of material) and (3) health (they can easily be dispersed in air and inhaled by nearby workers).
It is possible to overcome some of these problems by working in liquid or by premixing the GRM with polymer before use [55]. On the other hand, GRMs with low bulk density could perform better in some applications, for example as sorbents for water purification.
We measured the bulk density of commercial GRM powders with a Scott volumeter (see SI). Figure S3 shows the values of bulk density of the different GRMs analysed. The GRMs studied present a large range of values; as expected, the most exfoliated material (TRGO) showed the lowest density, 6 mg ml−1, but no general correlation between exfoliation and density was observed (figure S3). Some samples (e.g. Graphenit-OX and XGnP C750) also showed a relatively high density together with a good SSA, indicating that it is possible to achieve good exfoliation without increasing too much the volume occupied by the GRM.
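The definition of bulk density given above, and its consequence for transport, can be illustrated with a short calculation. This is a minimal sketch: the function names, the cup volume and the powder mass are hypothetical, chosen only so that the result matches the 6 mg ml−1 reported for TRGO.

```python
def bulk_density_mg_ml(powder_mass_g, cup_volume_ml):
    """Bulk density of an untapped powder: mass divided by the volume it fills."""
    if cup_volume_ml <= 0:
        raise ValueError("cup volume must be positive")
    return powder_mass_g * 1000.0 / cup_volume_ml


def container_volume_litres(mass_kg, density_mg_ml):
    """Volume occupied by a given mass of untapped powder, in litres."""
    volume_ml = mass_kg * 1.0e6 / density_mg_ml   # kg -> mg, then mg / (mg/ml)
    return volume_ml / 1000.0


# Hypothetical measurement: 0.15 g of powder filling a 25 ml cup
density = bulk_density_mg_ml(0.15, 25.0)
print(round(density, 2))                               # about 6 mg/ml
print(round(container_volume_litres(1.0, density), 1)) # litres needed for 1 kg
```

At 6 mg ml−1, one kilogram of untapped powder occupies well over a hundred litres, which gives a sense of the transport problem mentioned above.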

Conclusions
The 'ideal' GRM material would be composed of 100% sp 2 carbon in the form of large, mesoscopic flakes with monoatomic thickness, having at the same time high packing density.
Of course, the real GRM are very far from this ideal definition, and each of them represents a compromise among the different properties we would like to have.
To represent all key properties of different GRM in a synthetic way, we used the approach suggested in [9], using a 3D representation of data, with X, Y, Z axes corresponding to flake size, oxidation and defectivity respectively (figure S4). However, the representation was not clear enough for our data.
We also tested an alternative graphical representation, combining the different data in a 4-axis graph (figure 7); in this 2D plot, each X or Y semi-axis corresponds to a GRM property:
- Positive X semi-axis: average lateral size, reported as the D50 value, obtained from Laser Scattering measurements;
- Negative X semi-axis: the percentage of defects, calculated as % of sp2 by XPS measurements;
- Positive Y semi-axis: specific surface area (SSA) from BET measurements;
- Negative Y semi-axis: the average bulk density.
Each rectangle is specific to a GRM, and crosses the axes in 4 points corresponding to the specific values of the material properties.
The rectangles give the opportunity to better visualize the properties of each material and allow a rapid comparison between different products. For example, it is possible to see the differences between a highly exfoliated material like TRGO (with high surface area, low bulk density and a high level of defects) and AVA-FLG 23, a material that is less exfoliated (low surface area), but with higher lateral size and bulk density and a much lower number of defects compared to TRGO. A material approaching the 'perfect' graphene mentioned above would be represented as the dashed light blue rectangle in figure 7, with a large overall area and high values along all four semi-axes.
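The construction of one rectangle in the 4-axis graph can be sketched as follows. Each property fixes where one side of the rectangle crosses a semi-axis, so the four values fully define the four corners. The function names, the normalisation of each property by an axis maximum and all numerical values are assumptions made for illustration; they are not the scaling or data used for figure 7.

```python
def rectangle_corners(d50_um, sp2_percent, ssa_m2_g, bulk_density_mg_ml, axis_max):
    """Corners of the rectangle representing one GRM in the 4-axis plot.

    Each property is divided by a hypothetical axis maximum so that the
    four semi-axes share a common 0-1 scale.
    """
    right = d50_um / axis_max["d50_um"]                           # +X: lateral size
    left = -sp2_percent / axis_max["sp2_percent"]                 # -X: % sp2
    top = ssa_m2_g / axis_max["ssa_m2_g"]                         # +Y: SSA
    bottom = -bulk_density_mg_ml / axis_max["bulk_density_mg_ml"] # -Y: bulk density
    return [(right, top), (left, top), (left, bottom), (right, bottom)]


def rectangle_area(corners):
    """Area of the rectangle, a crude single-number summary of the plot."""
    (right, top), (left, _), (_, bottom), _ = corners
    return (right - left) * (top - bottom)


# Hypothetical axis maxima and sample values, for illustration only
limits = {"d50_um": 100.0, "sp2_percent": 100.0,
          "ssa_m2_g": 2600.0, "bulk_density_mg_ml": 500.0}
corners = rectangle_corners(50.0, 90.0, 1300.0, 250.0, limits)
print(corners)
print(rectangle_area(corners))
```

The corner list can be passed directly to a plotting library as a closed polygon; the area is only a rough indicator, since the text stresses that no single GRM is 'better' in absolute terms.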
Our results cannot be used to claim that one GRM is 'better' than another in absolute terms, because different GRM are suitable for different applications; for example, large flakes are usually needed for mechanical reinforcement [45], while smaller sheets, with a high number of edges favouring ion intercalation, can be better for charge or energy storage [56]. The exfoliation grade is the parameter discriminating graphene from graphite nanoplatelets; however, in many cases poorly exfoliated GRM can have a better performance/cost ratio than higher quality graphene. The goal of this exercise was thus not to select the best commercial material, but to demonstrate that the techniques used can be applied to measure and compare the properties of a wide range of commercial GRM, obtaining meaningful results. Notably, this allowed us to observe how the different desirable properties of GRM can coexist with each other, and how different GRM show common limitations.
Our results suggest that it is statistically possible to achieve high exfoliation without excessively fragmenting the flakes (as previously suggested by the case of GO, which can be obtained in monolayer form but with lateral size exceeding 10 μm) [13, 48]. Conversely, the exfoliation grade and defectivity seem statistically correlated, even if they do not follow a simple linear correlation.
All products we measured featuring a high-quality, unperturbed graphene-like lattice (%sp2 > 95%) featured an SSA lower than 200 m2 g−1; conversely, GRM featuring a high SSA had a high defectivity, with %sp2 < 95%. This general observation applied to GRM produced with different techniques, in different factories on different continents. This indicates that production of high-quality graphene, even if possible at lab scale, still represents a major industrial challenge. However, many industrial applications do not require perfectly exfoliated graphene, while new methods for high-yield exfoliation are being developed continuously at the research level [35, 57]. While the current quality of industrial GRM is clearly good enough for many applications, its properties still need to be better defined and qualified using standard methods, as we tried to do in this work.