Classifying the tracing difficulty of 3D neuron image blocks based on deep learning

Quickly and accurately tracing neuronal morphologies in large-scale volumetric microscopy data is a very challenging task. Most automatic algorithms for multi-neuron tracing in a whole brain are designed under the Ultra-Tracer framework, which begins tracing a neuron from its soma and traces all signals via a block-by-block strategy. Some neuron image blocks are easy to trace and their automatic reconstructions are very accurate, while others are difficult and their automatic reconstructions are inaccurate or incomplete. The former are called low Tracing Difficulty Blocks (low-TDBs), and the latter are called high Tracing Difficulty Blocks (high-TDBs). We design a model named 3D-SSM to classify the tracing difficulty of 3D neuron image blocks. It is based on a 3D Residual neural Network (3D-ResNet), a Fully Connected Neural Network (FCNN) and a Long Short-Term Memory network (LSTM). 3D-SSM contains three modules: Structure Feature Extraction (SFE), Sequence Information Extraction (SIE) and Model Fusion (MF). SFE utilizes a 3D-ResNet and a FCNN to extract two kinds of features from 3D image blocks and their corresponding automatic reconstruction blocks. SIE uses two LSTMs to learn sequence information hidden in 3D image blocks. MF adopts a concatenation operation and a FCNN to combine the outputs of SIE. 3D-SSM can be used as a stop condition of an automatic tracing algorithm in the Ultra-Tracer framework. With its help, neuronal signals in low-TDBs can be traced by the automatic algorithm, while those in high-TDBs may be reconstructed by annotators. A total of 12732 training samples and 5342 test samples are constructed from neuron images of a whole mouse brain. 3D-SSM achieves classification accuracy rates of 87.04% on the training set and 84.07% on the test set. Furthermore, the trained 3D-SSM is tested on samples from another whole mouse brain, where its accuracy rate is 83.21%.


Introduction
Tracing morphologies of neurons is essential for investigating the structure and the function of neurons, exploring the working mechanism of the brain and studying the mechanism of brain diseases, such as neuron classification [1], neuron morphology analysis [2] and potential connectivity between brain circuits [3]. For small or medium scale volumetric light microscopy datasets, the BigNeuron project collected more than 30 automatic algorithms and transplanted them to the Vaa3D platform [4,5], which can visualize 3D neuron images, produce automatic reconstructions and analyze neuronal morphologies. For large-scale image datasets [6,7], Bria et al. developed a Vaa3D-Terafly open source tool to visualize, analyze and manage them [8], and Peng et al. proposed an Ultra-Tracer framework to trace their signals [9]. However, it is still a very challenging task to quickly and accurately trace neuronal morphologies in large-scale multi-neuron images of a whole mouse brain.

Open Access
Brain Informatics. *Correspondence: jianyang@bjut.edu.cn
The Convolutional Neural Network (CNN) plays an important role in the field of image classification. Simonyan et al. proposed the VGG network [10], which generated good results on the ImageNet dataset [19] by using a small receptive field and more layers. Szegedy et al. designed the Inception network [11,20], which uses convolution kernels of different sizes to increase the diversity of features and adopts a large number of 1 × 1 convolution kernels to reduce the number of network parameters. He et al. proposed the Residual neural Network (ResNet) for image recognition, which builds a deeper neural network by utilizing skip connections to jump over some layers [12].
Deep learning has also been used to solve many image-related problems in automatic neuron tracing on large-scale multi-neuron images. Zhou et al. developed the Deep-Neuron toolbox, which adopts deep neural networks to learn features and rules hidden in light microscopy images and traces neuronal morphologies [21]. Chen et al. presented a method based on spherical-patch extraction and a 2D multi-stream CNN, which can simultaneously detect all 3 types of 3D critical points in neuron microscopy images [22]. Liu et al. designed a deep learning based segmentation method to identify the location of neuronal voxels, which is capable of both enhancing neuronal signals and reducing image noise [23]. Jiang et al. proposed a method based on a ray-shooting model and a Long Short-Term Memory network (LSTM) [24], which is able to enhance weak-signal neuronal structures and remove background noise in 3D neuron images [25]. Although the above deep learning based methods have made some progress on the multi-neuron reconstruction task, it is still very challenging to accurately and quickly trace neuronal morphologies in multi-neuron images of a whole mouse brain.
In 3D multi-neuron images of a whole mouse brain, there are some image blocks with simple morphology structures, strong signals and weak noise, on which many automatic algorithms can trace morphologies quite accurately. These image blocks (the first row in Fig. 1) are called low Tracing Difficulty Blocks (low-TDBs). There are many other image blocks with complex morphology structures (including bifurcations, crossing signals, etc.), weak signals or strong noise, on which most automatic algorithms trace morphologies inaccurately or incompletely. These image blocks (the second row in Fig. 1) are called high Tracing Difficulty Blocks (high-TDBs). If we can classify 3D image blocks as low-TDBs or high-TDBs, neuronal signals in low-TDBs can be traced by an automatic algorithm, while those in high-TDBs may be reconstructed by annotators. This interactive strategy of automatic tracing and manual annotation can help the Ultra-Tracer framework generate more accurate neuron reconstructions with higher efficiency. Therefore, it is worthwhile to study the tracing difficulty classification of 3D neuron image blocks.
In this paper, a model called 3D-SSM is designed to classify the tracing difficulty of 3D image blocks, which is based on ResNet [12], Fully Connected Neural Network (FCNN) [26] and Long Short-Term Memory network (LSTM) [24]. 3D-SSM consists of three modules: Structure Feature Extraction (SFE), Sequence Information Extraction (SIE) and Model Fusion (MF). SFE utilizes a 3D-ResNet and a FCNN to extract two kinds of features in 3D neuron image blocks and automatic reconstruction blocks. SIE uses two LSTMs to learn sequence information hidden in features of sequential blocks generated in SFE. MF adopts a concatenation operation and a FCNN to fuse outputs from SIE.
Fig. 1 Examples of low-TDBs and high-TDBs
As shown in Fig. 2, we divide the implementation of classifying the tracing difficulty of 3D image blocks into three steps: constructing blocks, labeling blocks and classifying samples by 3D-SSM. Firstly, neuron images of a whole mouse brain are partitioned into many overlapping 3D image blocks along the gold standard reconstruction (reconstructed by professional annotators) of each neuron. Gold standard reconstruction blocks (gold blocks) and automatic reconstruction blocks (auto blocks) are the parts of the gold standard reconstruction and automatic reconstruction located in each 3D image block, respectively. Then, each 3D image block is labeled as low-TDB or high-TDB, either manually or by an automatic algorithm. Finally, 3D image blocks and L-Measure [27] features of the corresponding reconstructions are used to train and test 3D-SSM. It achieves classification accuracy rates of 87.04% and 84.07% on the training set and test set, respectively. Results of the tracing difficulty classification produced by the 3D-SSM model can be used as a stop condition of an automatic tracing algorithm in the Ultra-Tracer framework. With that, neuronal signals in low-TDBs can be traced by the automatic algorithm, while those in high-TDBs may be reconstructed by annotators. The interaction between automatic tracing and manual reconstruction can help the Ultra-Tracer framework generate more accurate neuron reconstructions efficiently.
This paper has the following three contributions:
• The task of classifying the tracing difficulty of 3D neuron image blocks is proposed, and its solution procedure is designed.
• More than 23000 samples are constructed on two whole mouse brains, and manual labeling and automatic labeling are used to generate the labels of these samples.
• Based on 3D-ResNet, FCNN and LSTM, a 3D-SSM model is proposed to classify the tracing difficulty of 3D image blocks, and it has good performance on 3D image blocks of two whole mouse brains.
The rest of the paper is organized as follows. In Section 2, we introduce how to construct sample data including producing block data, extracting features and labeling samples. In Section 3, we present the 3D-SSM model including its three modules: SFE, SIE and MF. The experimental results are reported in Section 4, and conclusions and discussions are in Section 5.

Sample data
Sample data is constructed from 3D neuron images of two whole mouse brains (denoted by brain-A and brain-B, respectively), gold standard reconstructions and automatic reconstructions of marked neurons in them. These data were provided by the Southeast University-Allen Institute Joint Center. For each 3D image block, its corresponding gold standard reconstruction blocks (gold blocks) and automatic reconstruction blocks (auto blocks) are taken out for extracting features. Neuron distance [28] features between gold blocks and auto blocks are used to describe their similarity, and L-Measure [27] features of auto blocks are employed to characterize the neuronal structures in 3D image blocks. The label of a 3D image block is its tracing difficulty, low-TDB or high-TDB.

Constructing blocks
Gold standard reconstructions of neurons in brain-A and brain-B were drawn and checked by at least three professional annotators using the Vaa3D-Terafly visualization software [8], and are stored in SWC format [29], which describes the morphology of a neuron as tree structures with node location, node radius, parent node and some other attributes. There are 93 and 37 gold standard reconstructions of marked neurons in brain-A and brain-B, respectively. Automatic reconstructions are generated by the APP2 algorithm [30] under the Ultra-Tracer framework. APP2 is one of the state-of-the-art automatic tracing methods and can efficiently produce good reconstructions for many complex neurons. Starting from the root node of each neuron (corresponding to its soma) and moving along its gold standard reconstruction with a step of 100 nodes, 3D neuron images are partitioned into many overlapping 3D image blocks of size 32 × 64 × 64. Gold blocks and auto blocks are the parts of the gold standard reconstruction and the APP2 reconstruction, respectively, that are located in the corresponding 3D image blocks. As illustrated in Fig. 3, three nodes spaced 100 nodes apart in a gold standard reconstruction are selected as centers (anchors) of three 3D image blocks (the second column in Fig. 3), and their corresponding gold blocks (the third column in Fig. 3) and auto blocks (the fourth column in Fig. 3) are obtained.
There are some burrs in auto blocks, which may be noise and carry no meaning. We delete branches with fewer than 4 nodes, as well as fragments that have fewer than 4 nodes and lie less than 4 voxels from the edge of the image block. Some auto blocks have too few reconstruction nodes, which means that these blocks contain few neuronal signals and provide inadequate information to evaluate the quality of reconstructions. 3D image blocks corresponding to auto blocks with fewer than 11 nodes in total are filtered out. Finally, 18074 samples (3D image blocks) from brain-A and 5109 samples from brain-B are obtained, and their corresponding gold blocks and auto blocks are also taken out from the gold standard reconstructions and APP2 reconstructions.
Fig. 3 Constructing 3D image blocks, gold blocks and auto blocks from 3D neuron images, gold standard reconstructions and automatic reconstructions, respectively
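The node-count filter described above can be sketched as follows. This is a minimal illustration, assuming each auto block is available as SWC text with the usual seven columns (id, type, x, y, z, radius, parent); the function names `parse_swc` and `keep_block` are hypothetical, and the branch and fragment pruning step is not shown.

```python
def parse_swc(text):
    """Parse SWC text into a list of node dicts, skipping comments and blank lines."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        nid, ntype, x, y, z, radius, parent = line.split()[:7]
        nodes.append({
            "id": int(nid),
            "type": int(ntype),
            "xyz": (float(x), float(y), float(z)),
            "radius": float(radius),
            "parent": int(parent),  # -1 marks a root node in SWC
        })
    return nodes


MIN_NODES = 11  # from the text: blocks with fewer than 11 nodes are filtered out


def keep_block(swc_text, min_nodes=MIN_NODES):
    """Return True if the auto block has enough nodes to be kept as a sample."""
    return len(parse_swc(swc_text)) >= min_nodes


def make_chain(n):
    """Build a toy SWC string describing a straight chain of n nodes."""
    rows = [f"{i} 3 {float(i)} 0.0 0.0 1.0 {i - 1 if i > 1 else -1}"
            for i in range(1, n + 1)]
    return "# toy reconstruction\n" + "\n".join(rows)


print(keep_block(make_chain(11)), keep_block(make_chain(10)))
```

The threshold is kept as a parameter so that the same check can be reused if a different minimum node count is preferred.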

Extracting reconstructions based features
The Entire Structure Average (ESA), Different Structure Average (DSA) and Percentage of Different Structures (PDS) are often used to measure the similarity between two reconstructions [28]. The following 7 distances are taken as neuron distance features: $\mathrm{ESA}_{12}$ (the average distance of all nodes in reconstruction 1 to nodes in reconstruction 2), $\mathrm{ESA}_{21}$ (the average distance of all nodes in reconstruction 2 to nodes in reconstruction 1), $\mathrm{ESA}$ (the average of $\mathrm{ESA}_{12}$ and $\mathrm{ESA}_{21}$), $\mathrm{DSA}_{2}$ (the average distance of nodes with distance larger than two voxels), $\mathrm{PDS}_{12}$ (the percentage of nodes in reconstruction 1 that have distance no less than two voxels to nodes in reconstruction 2), $\mathrm{PDS}_{21}$ (the percentage of nodes in reconstruction 2 that have distance no less than two voxels to nodes in reconstruction 1) and $\mathrm{PDS}$ (the percentage of nodes in reconstruction 1 or reconstruction 2 that have distance no less than two voxels to nodes in the other reconstruction). The smaller these values are, the more similar the two reconstructions are. The above 7 neuron distance features for gold blocks and auto blocks can be obtained by calling the neuron distance plug-in in the Vaa3D platform. In addition, 3 morphology features (the number of bifurcations, the number of nodes and the number of fragments) are also used to describe the similarity between a gold block and its auto block.
L-Measure, developed by Scorcioni et al. [27], is a toolkit for extracting neuronal morphology features (http://cng.gmu.edu:8080/Lm/help/index.htm). It provides 43 morphology features of neuron fragments, such as length, width, height and angle. We select 32 L-Measure features to describe auto blocks; the other 11 L-Measure features do not make sense for neuron fragments in auto blocks.
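To make the ESA, DSA and PDS definitions concrete, the sketch below computes them for two small node sets. It is a simplified illustration only: it measures node-to-nearest-node distances, uses the same "no less than two voxels" threshold for DSA and PDS, and does not reproduce the Vaa3D neuron distance plug-in. The function names and dictionary keys are hypothetical.

```python
import math


def nearest_distances(src, dst):
    """Distance from each node in src to its nearest node in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def neuron_distance_features(rec1, rec2, threshold=2.0):
    """Simplified ESA / DSA / PDS features between two lists of (x, y, z) nodes."""
    d12 = nearest_distances(rec1, rec2)
    d21 = nearest_distances(rec2, rec1)
    far12 = [d for d in d12 if d >= threshold]  # "different" nodes of reconstruction 1
    far21 = [d for d in d21 if d >= threshold]  # "different" nodes of reconstruction 2
    far = far12 + far21
    esa12 = sum(d12) / len(d12)
    esa21 = sum(d21) / len(d21)
    return {
        "ESA12": esa12,
        "ESA21": esa21,
        "ESA": (esa12 + esa21) / 2,
        "DSA": sum(far) / len(far) if far else 0.0,
        "PDS12": len(far12) / len(d12),
        "PDS21": len(far21) / len(d21),
        "PDS": len(far) / (len(d12) + len(d21)),
    }


gold = [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]
auto = [(0.0, 0.0, 0.0), (0.0, 0.0, 13.0)]
features = neuron_distance_features(gold, auto)
print(features)
```

In this toy case one node of each reconstruction matches exactly and the other is 3 voxels away, so half of the nodes count as "different" and all seven values are small but non-zero.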

Labeling samples
Labeling samples is an important step in supervised classification tasks. Gold blocks and auto blocks are used to generate labels of 3D image blocks. For an automatic tracing algorithm, the tracing difficulty (low or high) of a 3D image block can be determined according to the similarity between its corresponding gold block and auto block. If they are very consistent, the 3D image block is labeled as low tracing difficulty (low-TDB), otherwise as high tracing difficulty (high-TDB). Since tree structures in gold blocks and auto blocks are very complex and diverse, it is quite difficult to accurately quantify their consistency. Each pair of gold block and auto block is visualized on the Vaa3D platform and carefully compared by our annotators. After observing and comparing many pairs, 4 category rules are induced to label a 3D image block as low-TDB or high-TDB (https://github.com/BingooYang/Tracing-difficulty-classification-on-3D-neuron-image-block). According to these rules, 2954 3D image blocks from brain-A are labeled by one annotator and checked by two other annotators. However, manual labeling is very time consuming, so automatic labeling has to be adopted for the remaining blocks. The extracted 7 neuron distance features and 3 morphology features of gold blocks and auto blocks are used to describe the similarity between each pair of gold block and auto block. If the similarity of a pair is very high, the automatic tracing algorithm performs quite well on the corresponding 3D image block and it is labeled as low-TDB, otherwise it is labeled as high-TDB. The above 2954 manually labeled pairs are used to train and test a FCNN model that classifies the similarity of all gold block and auto block pairs. The FCNN model consists of four linear layers, and the number of nodes from the first layer to the fourth layer is 50, 30, 20 and 2, respectively. It produces accuracy rates of 96.9% and 96.4% on the training set and test set, respectively, so the trained FCNN has good performance and can be used to generate the labels of the remaining 3D image blocks. It is worth noting that this model utilizes gold standard reconstructions to learn the label of 3D image blocks. More often, however, no gold standard reconstruction is available, and the trained FCNN cannot be used to solve the tracing difficulty classification task on 3D image blocks.

Method
A deep learning based 3D-SSM model is designed to classify the tracing difficulty of 3D image blocks. As illustrated in Fig. 4, 3D-SSM consists of three modules: Structure Feature Extraction (SFE), Sequence Information Extraction (SIE) and Model Fusion (MF). In SFE, a 3D-ResNet and a FCNN are trained by taking 3D image blocks and 32 L-Measure features of auto blocks as inputs, and their parameters are saved. In SIE, two LSTMs are adopted to extract the sequence information hidden in 3D image blocks and auto blocks, and the network parameters are also saved. In MF, the outputs of 3D image blocks and auto blocks produced in SIE are concatenated and taken as the inputs of a FCNN, and SFE, SIE and the FCNN are trained together.

SFE of 3D-SSM
SFE contains two networks, a 3D image block based 3D-ResNet and an auto block based FCNN, which are used to extract structure features of 3D image blocks and auto blocks, respectively. The 3D image block based 3D-ResNet is used to label a 3D image block as low-TDB or high-TDB from feature maps of 3D images. The 3D-ResNet is designed on the basis of a usual ResNet, and contains convolution, pooling, batch normalization, dropout, skip connections and so on. Its inputs are 3D image blocks, its output is a two-dimensional feature vector, and its network structure is given in Table 1. The auto block based FCNN is used to label a 3D image block as low-TDB or high-TDB from features of auto blocks. That is to say, the tracing difficulty of a 3D image block is evaluated only by morphology features of its corresponding auto block. The FCNN is composed of three linear layers, and the number of nodes from the first layer to the third layer is 100, 50 and 2, respectively. Its inputs are the extracted 32 L-Measure features and its output is a two-dimensional feature vector.
It should be noted that outputs of the 3D-ResNet and the FCNN are set as 2 dimensional vectors for the following two purposes. The first one is to calculate their cross entropy loss with the one-hot encoding labels and so as to train the 3D-ResNet and the FCNN separately. The second one is to concatenate them with one-hot encoding sequence labels of 3D image blocks in the SIE module.
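As a rough illustration of the auto block based FCNN and its cross entropy training signal, the following PyTorch sketch builds a network with layer widths 100, 50 and 2 on 32 input features. The ReLU activations between layers, the helper name `make_fcnn` and the random inputs are assumptions made for the example; the text does not specify the activation function.

```python
import torch
from torch import nn


def make_fcnn(in_features, widths):
    """Stack linear layers with the given widths; ReLU between layers is an assumption."""
    layers = []
    prev = in_features
    for i, width in enumerate(widths):
        layers.append(nn.Linear(prev, width))
        if i < len(widths) - 1:
            layers.append(nn.ReLU())
        prev = width
    return nn.Sequential(*layers)


# Auto block based FCNN: 32 L-Measure features -> 100 -> 50 -> 2.
sfe_fcnn = make_fcnn(32, [100, 50, 2])

features = torch.randn(4, 32)          # four hypothetical auto blocks
labels = torch.tensor([0, 1, 1, 0])    # 0 = low-TDB, 1 = high-TDB (class indices)
logits = sfe_fcnn(features)            # two-dimensional output per block
loss = nn.CrossEntropyLoss()(logits, labels)
print(logits.shape, loss.item())
```

`nn.CrossEntropyLoss` takes class indices here, which is equivalent to comparing the softmax of the two-dimensional output with one-hot labels.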

SIE of 3D-SSM
SIE consists of two LSTMs for extracting sequence information hidden in 3D image blocks. Both of them have two layers and each layer consists of 10 hidden nodes. The sequence relation among 3D image blocks should be defined explicitly. Two 3D image blocks are defined as adjacent blocks in a sequence if they satisfy the following conditions: (1) The distance between their anchors is less than or equal to 100 voxels. (2) The node distance between their corresponding nodes on the gold standard reconstruction is 100 nodes. Starting from any 3D image block and according to the above defined adjacency relation, many block sequences with different lengths can be obtained from our 3D image blocks.
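The adjacency rule and the construction of block sequences can be sketched as below. This is a simplified illustration that assumes each block records its anchor coordinates and the index of its anchor node along a single unbranched path of the gold standard reconstruction; the field names `anchor` and `node_index` and both function names are hypothetical.

```python
import math


def are_adjacent(a, b, step=100, max_dist=100.0):
    """Two blocks are adjacent if their anchors are `step` nodes apart
    along the reconstruction and at most `max_dist` voxels apart in space."""
    return (abs(a["node_index"] - b["node_index"]) == step
            and math.dist(a["anchor"], b["anchor"]) <= max_dist)


def sequences_ending_at(blocks, length, step=100, max_dist=100.0):
    """Collect every sequence of `length` consecutive adjacent blocks."""
    by_index = {b["node_index"]: b for b in blocks}
    sequences = []
    for block in blocks:
        seq = [block]
        while len(seq) < length:
            prev = by_index.get(seq[0]["node_index"] - step)
            if prev is None or not are_adjacent(prev, seq[0], step, max_dist):
                break
            seq.insert(0, prev)  # grow the sequence backwards towards the soma
        if len(seq) == length:
            sequences.append(seq)
    return sequences


blocks = [
    {"node_index": 0, "anchor": (0.0, 0.0, 0.0)},
    {"node_index": 100, "anchor": (60.0, 0.0, 0.0)},
    {"node_index": 200, "anchor": (120.0, 0.0, 0.0)},
    {"node_index": 300, "anchor": (300.0, 0.0, 0.0)},  # too far from the previous anchor
]
for s in (1, 2, 3):
    print(s, len(sequences_ending_at(blocks, s)))
```

In this toy example the last block is 180 voxels from its predecessor, so it never joins a sequence longer than 1, which shows how the distance condition breaks a chain even when the node spacing is correct.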
Intuitively, most adjacent blocks have the same tracing difficulty. As shown in Fig. 5, the red line and the yellow line in the left picture are the gold standard reconstruction and the APP2 reconstruction of a neuron, and red dots and blue dots in the right picture represent locations of low-TDBs and high-TDBs, respectively. From Fig. 5, it can be seen that most red dots or blue dots are scattered continuously along the gold standard line. That is to say, adjacent 3D image blocks tend to have the same tracing difficulty. Furthermore, this observation is tested on 18074 samples from brain-A. For each 3D image block, a block sequence with length 2 is generated, and the probability that the two blocks in all 18074 sequences have the same label is 60.55%.
Table 1 The structure of 3D-ResNet. * Unit-A(n) consists of a 3 × 3 × 3 convolution with a 1 × 1 × 1 stride and n channels, a batch normalization, and an activation function (ReLU). Unit-B(n) has the same structure as Unit-A(n) but a different convolution stride of 2 × 2 × 2.
Based on SFE, SIE learns the sequence information hidden in 3D image blocks by two LSTMs, a 3D image block based LSTM and an auto block based LSTM. Let $X_I = \{X_{I1}, X_{I2}, \ldots, X_{I(n-1)}, X_{In}\}$ be $n$ 3D image blocks, $X_A = \{X_{A1}, X_{A2}, \ldots, X_{A(n-1)}, X_{An}\}$ be the 32 L-Measure features of their corresponding $n$ auto blocks, and $L = \{L_1, L_2, \ldots, L_{n-1}, L_n\}$ be their one-hot encoding labels. Suppose that $X'_{Is} = (X'_{I1}, X'_{I2}, \ldots, X'_{I(s-1)}, X'_{Is})$ and $X'_{As} = (X'_{A1}, X'_{A2}, \ldots, X'_{A(s-1)}, X'_{As})$ are two sequences with length $s$ constructed from $X_I$ and $X_A$, and $L'_s = (L'_1, L'_2, \ldots, L'_{s-1}, L'_s)$ are their one-hot encoding labels, where $X'_{Ii} \in X_I$ and $X'_{Ai} \in X_A$ ($i = 1, 2, \cdots, s$). Denote the 3D-ResNet output of X
$$O_{Is} = W_{Is} L'_{Is} \quad \text{and} \quad O_{As} = W_{As} L'_{As}, \tag{1}$$

MF of 3D-SSM
Model fusion (MF) is used to integrate features and sequence information in 3D image blocks and auto blocks. MF is composed of a concatenation operation and a FCNN to fuse the output features of SIE. The FCNN has two linear layers, and the numbers of nodes in the first layer and the second layer are 30 and 2, respectively. MF can be expressed by the following formula:
$$O_M = W_M \, \mathrm{cat}(O_{Is}, O_{As}), \tag{2}$$
where cat is a concatenation operation, $W_M$ denotes the learnable parameters of the FCNN, and $O_M$ is the output of MF.
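One way to picture how SIE and MF fit together is the PyTorch sketch below: two 2-layer LSTMs with 10 hidden nodes process sequences of per-block SFE outputs, a linear map produces one output per branch, and the concatenated outputs pass through a FCNN with 30 and 2 nodes. Several details are assumptions made only for this illustration: the LSTM input is taken to be the two-dimensional SFE output of each block, each branch output has dimension 2, the linear layers carry bias terms, ReLU is used inside the FCNN, and the class name `FusionSketch` is hypothetical.

```python
import torch
from torch import nn


class FusionSketch(nn.Module):
    """Illustrative SIE + MF pipeline; not the authors' exact architecture."""

    def __init__(self, feat_dim=2, hidden=10, branch_out=2):
        super().__init__()
        # SIE: one LSTM per branch, two layers with 10 hidden nodes each.
        self.lstm_image = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        self.lstm_auto = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        # Linear maps applied to the LSTM state of the last block in a sequence.
        self.w_image = nn.Linear(hidden, branch_out)
        self.w_auto = nn.Linear(hidden, branch_out)
        # MF: concatenation followed by a FCNN with 30 and 2 nodes.
        self.mf = nn.Sequential(
            nn.Linear(2 * branch_out, 30),
            nn.ReLU(),
            nn.Linear(30, 2),
        )

    def forward(self, image_seq, auto_seq):
        out_image, _ = self.lstm_image(image_seq)   # (batch, seq_len, hidden)
        out_auto, _ = self.lstm_auto(auto_seq)
        o_image = self.w_image(out_image[:, -1])    # image-branch output
        o_auto = self.w_auto(out_auto[:, -1])       # auto-branch output
        return self.mf(torch.cat([o_image, o_auto], dim=1))


model = FusionSketch()
image_seq = torch.randn(5, 3, 2)  # 5 sequences of length 3, 2-d SFE outputs
auto_seq = torch.randn(5, 3, 2)
fused = model(image_seq, auto_seq)
print(fused.shape)
```

Only the state at the last block of each sequence is used, so the prediction refers to the final block while the earlier blocks provide context.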

Experiment
In this section, we validate the performance of the automatic labeling algorithm and each module of the proposed 3D-SSM model. Furthermore, modules of 3D-SSM trained on samples from brain-A are used to classify samples from brain-B.

Experimental setup
Our experimental data include 18074 3D image blocks from brain-A and 5109 3D image blocks from brain-B, and their corresponding gold blocks and auto blocks. We extract 7 neuron distance features, 3 neuronal morphology features and 32 L-Measure features from gold blocks and auto blocks, and manually label 2954 3D image blocks from brain-A as low-TDB or high-TDB. Using these data and the publicly available packages PyTorch [31] and scikit-learn [32], several algorithms for automatic labeling and the three modules of 3D-SSM are implemented on two NVIDIA P5000 GPUs with 16GB memory. We run all algorithms five times and report their average accuracy rates and F1 scores with the corresponding standard deviations.
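The reporting protocol (mean and standard deviation over five runs, with accuracy and F1 as metrics) can be illustrated with the short sketch below. The function names are hypothetical, the run values are made-up numbers used only to exercise the code, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import statistics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 score of the `positive` class from true and predicted labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize(values):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))

run_accuracies = [0.80, 0.82, 0.81, 0.79, 0.83]  # placeholder values, not results
print(summarize(run_accuracies))
```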

Results of automatic labeling
The 2954 manually labeled pairs of a gold block and an auto block are randomly divided into 70% (2068) training samples and 30% (886) test samples. The 7 neuron distance features and 3 neuronal morphology features are used to describe the similarity between each gold block and its corresponding auto block. According to the similarity of each pair, the corresponding 3D image block is labeled as low-TDB or high-TDB. We implement FCNN, logistic regression (LR) and Support Vector Machines (SVM) to label a 3D image block, and their average accuracy rates and F1 scores (with standard deviations) are given in Table 2.
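The random 70%/30% division can be reproduced in a few lines. This is a minimal sketch with a hypothetical function name and an arbitrary seed; it only shows how the 2068/886 sizes arise from 2954 samples.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(2954)
print(len(train_idx), len(test_idx))
```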
From Table 2, it can be seen that all three algorithms have quite good performance (about 96% accuracy rates and F1 scores) on both training and test sets. Among them, FCNN generates the best training and test results and is selected to label 3D image blocks without manual labels. For 15120 (18074−2954) 3D image blocks from brain-A, FCNN labels 8423 blocks as low-TDB and 6697 blocks as high-TDB. And for 5109 3D image blocks from brain-B, it labels 3416 blocks as low-TDB and 1693 blocks as high-TDB.

Results of the SFE module
We take 5342 (29.55%) samples from 23 neurons of brain-A as the test set, and 12372 samples from the other 70 (93−23) neurons of the same brain as the training set. For 3D image block based tracing difficulty classification, the training set is augmented by rotating 3D images by 90, 180 and 270 degrees about the X-axis, and 37116 new training samples are obtained.
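The rotation-based augmentation can be sketched with NumPy as follows. The example assumes that a block is stored as an array of shape (32, 64, 64) whose first axis is the X-axis, so that rotating about X keeps the block shape unchanged; the axis order and the function name are assumptions.

```python
import numpy as np


def rotate_about_first_axis(block):
    """Return three copies of the block rotated by 90, 180 and 270 degrees
    in the plane of the last two axes (i.e. about the first axis)."""
    return [np.rot90(block, k=k, axes=(1, 2)) for k in (1, 2, 3)]


block = np.arange(32 * 64 * 64, dtype=np.float32).reshape(32, 64, 64)
augmented = rotate_about_first_axis(block)
print([a.shape for a in augmented])
```

Each original block yields three new samples, so the number of added samples is three times the size of the original training set.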
The usual 2D ResNet [12], MobileNetV2 [33], DenseNet [34] and SENet [35] are extended to three dimensions (3D) and implemented to classify the tracing difficulty of 3D image blocks. The Adam optimizer [36] with $\beta_1 = 0.9$ and $\beta_2 = 0.99$ is applied to optimize these models, and their batch size, initial learning rate and weight decay are set to 30, 0.001 and 0.01, respectively. As given in Table 3, 3D-ResNet achieves the best test accuracy rate, so it is selected as the base method in the 3D-SSM model and its trained parameters are saved for the training of 3D-SSM.
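The optimizer settings above map directly onto `torch.optim.Adam`. The sketch below uses a single linear layer as a stand-in for the 3D networks and random tensors as data, both of which are placeholders. Note that the `weight_decay` argument of `torch.optim.Adam` adds an L2 penalty to the gradient; whether the original experiments used this form or a decoupled variant is not stated in the text.

```python
import torch
from torch import nn

BATCH_SIZE = 30

model = nn.Linear(32, 2)  # stand-in for a 3D network, used only to run one step
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.001,            # initial learning rate
    betas=(0.9, 0.99),   # beta_1 and beta_2
    weight_decay=0.01,
)

inputs = torch.randn(BATCH_SIZE, 32)
targets = torch.randint(0, 2, (BATCH_SIZE,))

# One optimization step with cross entropy loss.
loss = nn.functional.cross_entropy(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```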
For auto block based tracing difficulty classification, 32 L-Measure features of auto blocks are used as the input of an algorithm to generate the label of corresponding 3D image blocks. FCNN, LR and SVM are implemented on these features of training set and test set, and their results are given in Table 4. From Table 4, it can be seen that FCNN has the best performance, and so it is adopted as the algorithm for auto blocks based tracing difficulty classification and its parameters are saved for the training of the 3D-SSM model.
From Table 3 and Table 4, we can see that the 3D image block based 3D-ResNet has better performance than the auto block based FCNN. Since 3D-ResNet and FCNN utilize feature maps of images and morphology features of reconstructions, respectively, the labels generated by them might be quite different. We visually check their misclassified samples in the test set, and summarize three main types of errors. As illustrated in Fig. 6, 3D-ResNet does badly on 3D image blocks with simple and weak signals (yellow blocks in Fig. 6), while FCNN mainly makes mistakes on 3D image blocks with winding signals (red blocks in Fig. 6) or close signals (blue blocks in Fig. 6) from different neuronal segments. A possible reason is that 3D-ResNet pays more attention to the overall structure of neuronal signals, whereas FCNN focuses on the geometric complexity of neuronal signals. Hence, the fusion of the 3D image block based 3D-ResNet and the auto block based FCNN would provide more discriminative features for the classification task, which will be discussed in subsection 4.5.

Results of the SIE module
Two LSTMs are utilized to learn sequence information hidden in 3D image blocks: one is for the output of the 3D image block based 3D-ResNet (3D-ResNet-LSTM) and the other is for the output of the auto block based FCNN (FCNN-LSTM). Test accuracy rates of 3D-ResNet-LSTM and FCNN-LSTM with sequence length varying from 1 to 5 are given in Fig. 7. From Fig. 7, it can be seen that LSTMs with sequence length 3 improve the accuracy rates of 3D-ResNet and FCNN by about 0.8 and 0.5 percentage points, respectively. If the sequence length is larger than 3, the 3D image blocks in a sequence span a large area. In this case, the complexity of neuronal morphology structures and the signal-to-noise ratio (SNR) of signals might change greatly, so the tracing difficulty of different 3D image blocks in the sequence might differ (low or high). This explanation is consistent with the downward trend of the curves in Fig. 7.

Results of the 3D-SSM model on brain-A
The trained 3D-ResNet and FCNN in SFE and the two trained LSTMs in SIE are used to construct the 3D-SSM model (model fusion), and their trained parameters are used as the starting point for training the FCNN of MF. Table 5 gives accuracy rates of the 3D-SSM model with different sequence lengths in LSTMs. Its best results are 87.04% and 84.07% on the training set and test set, respectively. We visually check all test samples correctly classified by 3D-SSM, and summarize them into five main types: type-A, type-B, type-C, type-D and type-E. Type-A are 3D image blocks with only one neuronal segment and without weak signals, type-B are blocks with a few neuronal segments and a high SNR, type-C are blocks with many more neuronal segments and interlaced neuronal signals, type-D are blocks with many weak neuronal signals and complex neuronal morphologies, and type-E are blocks with a soma. The former two types belong to low-TDB, the latter three types are high-TDB, and type-C and type-D make up the majority of the high-TDB category. Five examples of each type are shown in Fig. 8. It can be seen that the 3D-SSM model correctly classifies 3D image blocks with different directions, lengths, SNRs, numbers of bifurcations and numbers of neuronal segments.

Results of the 3D-SSM model on brain-B
In order to verify the generalization of SFE, SIE and the 3D-SSM model, we use the best parameters (among five runs) trained on samples from brain-A to classify samples from brain-B. Table 6 gives the results of the trained 3D-ResNet-LSTM, FCNN-LSTM and 3D-SSM model on samples from brain-B. Comparing Fig. 7, Table 5 and Table 6, it can be seen that 3D-ResNet-LSTM has the same performance on brain-B, FCNN-LSTM has slightly worse performance on brain-B, and 3D-SSM even has a better F1 score on this brain. When the sequence length equals 3, 3D-SSM produces an accuracy rate of 83.21%. These experimental results show that 3D-SSM has good performance on classifying 3D image blocks from whole mouse brains not used for training.

Conclusions and discussions
In this paper, we construct 3D image blocks, gold blocks and auto blocks from 3D neuron images of two whole mouse brains, and extract 7 neuron distance features, 3 neuronal morphology features and 32 L-Measure features from gold blocks and auto blocks. 3D image blocks are labeled manually or by a FCNN, and the sequence relation among them is built. More importantly, a 3D-SSM model is proposed to classify the tracing difficulty of 3D image blocks, which has three modules: SFE, SIE and MF. SFE consists of a 3D-ResNet and a FCNN for extracting structure features of 3D image blocks and auto blocks. SIE adopts two LSTMs to extract sequence information hidden in 3D image blocks, and MF fuses the different features from SIE. These modules are validated on more than 20000 samples from two whole mouse brains. In addition, three types of samples misclassified by SFE and five types of samples correctly classified by 3D-SSM are summarized. Classification results on the tracing difficulty of 3D image blocks by 3D-SSM can be used as a stop condition for an automatic tracing algorithm in the Ultra-Tracer framework, which is an important factor in realizing the interaction between automatic tracing and manual reconstruction.
Although the proposed 3D-SSM model performs well on classifying the tracing difficulty of 3D image blocks, it is still hard to correctly classify 3D image blocks with weak signals, close signals, winding signals, complex neuronal structures and so on. In fact, the tracing difficulty classification task on 3D image blocks is much more difficult than traditional image classification tasks. Firstly, 3D image blocks have a much higher dimension, so more parameters need to be learned and model optimization is more difficult. Secondly, the number of neuronal fragments, signal strengths, signal quantity, signal directions, signal shapes and so on differ considerably among 3D image blocks, so it is hard for a model to learn enough advanced features that can distinguish low-TDBs from high-TDBs well. Although it is difficult to classify the tracing difficulty of 3D image blocks, this task is important for the interaction between automatic tracing and manual reconstruction, and we will further explore it from the perspective of producing more accurate data and designing better models.