Effect of an antimitotic agent colchicine on thioacetamide hepatotoxicity.

In an earlier study we established that timely and adequate tissue repair response following the administration of a six-fold dose-range of thioacetamide (TA; 50, 150, and 300 mg/kg) prevented progression of injury and led to recovery and animal survival. Delayed and attenuated repair response after the 600 mg/kg TA dose resulted in a marked progression of injury and 100% lethality. The objective of the present study was to further scrutinize this concept in an experimental protocol in which we hypothesized that a selective ablation of the tissue repair response should lead to lethality from the nonlethal, moderately toxic doses of 150 and 300 mg/kg TA. In this study we investigated the effect of the antimitotic agent colchicine (CLC, 1 mg/kg) on the outcome of TA hepatotoxicity. Male Sprague-Dawley rats (175-225 g) were injected intraperitoneally (ip) with 150 and 300 mg/kg TA. We assessed liver injury by serum enzyme elevations and histopathology. Tissue regeneration response was measured by 3H-thymidine incorporation into hepatonuclear DNA and by proliferating cell nuclear antigen (PCNA) assay. S-Phase stimulation, as indicated by 3H-thymidine incorporation, was noted at 36 and 48 hr following the administration of 150 mg/kg TA, whereas with the 300 mg/kg TA S-phase stimulation was elicited at 48 hr following treatment. Therefore, two doses of CLC (30 hr and 42 hr, 1 mg/kg, ip) were administered to the 150 mg/kg treated group while a single dose of CLC (42 hr, 1 mg/kg, ip) was administered to the 300 mg/kg group. CLC treatment resulted in 100% lethality in both groups. Thus, CLC administration converted nonlethal doses into lethal doses. The 150 mg/kg TA dose was then chosen to further investigate the underlying mechanism. Rats treated with TA alone recovered from injury by 36-48 hr while CLC treatment resulted in a progression of injury as indicated by serum enzyme elevation and histopathology. 
Tissue repair, as evidenced by 3H-thymidine incorporation and PCNA studies, explained this dichotomy. Antimitotic intervention with CLC resulted in a significantly diminished repair response leading to unrestrained progression of injury and lethality even from nonlethal doses. This model demonstrates the critical role of tissue repair response in determining the final outcome of toxicity.


Background and motivation
Sequence alignment tools are essential to biological research [see, e.g. (1), for a survey of multiple sequence alignment methods]. Beyond the residues/nucleotides themselves, biologists often possess additional knowledge about the function, structure or conserved patterns of the sequences to be analyzed. It is generally desirable to incorporate such information into the alignment procedure, so that the alignment result is more biologically meaningful. For example, functionally important sites are generally expected to be aligned together, but a typical alignment tool often fails to achieve this if the sequence similarity is low. Imposing constraints that represent such information turns out to be an effective way to incorporate biological knowledge into an alignment tool.
Motivated by this demand, Tang et al. (2) formulated the constrained multiple sequence alignment problem, where each constraint is a single residue/nucleotide. They considered the alignment of RNase sequences, which are known to have a sequence of conserved residues His (H), Lys (K) and His. With H, K, H as constraints, each of these three residues is aligned together in one column of the resulting constrained alignment, and the three columns appear in the specified order. Chin et al. (3) then proposed an improved algorithm for pairwise alignment and an approximation algorithm for multiple alignment. Other formulations of alignment with constraints have also been proposed from different perspectives and with various approaches (4-14).
Conserved sites of a protein/RNA/DNA family are often several residues/nucleotides long. For these patterns, the original formulation in (2) is not expressive enough. In addition, such patterns may not, in general, appear in exact form. Consequently, Tsai et al. (15) proposed a generalized formulation and algorithm, where each constraint is a (usually short) string pattern allowing mismatches. Lu and Huang (16) then proposed a space-efficient algorithm for this formulation. Web-based systems, MuSiC (15) (available at http://genome.life.nctu.edu.tw/MUSIC) and MuSiC-ME (16) (available at http://genome.life.nctu.edu.tw/MUSICME), were also developed; from now on these two systems will be referred to jointly as MuSiC. With the aid of MuSiC, Tsai et al. (15) and Lu and Huang (16) successfully identified a fragment in the 3′ untranslated region (3′-UTR) of a SARS (severe acute respiratory syndrome) coronavirus sequence that can fold into a pseudoknot, which is potentially responsible for self-replication of the virus. Indeed, since its release, MuSiC has been found useful in, e.g. detection of functionally and/or structurally important residues/motifs in sequences (17,18), prediction of RNA pseudoknotted structures (15,19,20), prediction of protein structures (21) and so on.
*To whom correspondence should be addressed. Tel: +886-3-5712121; Fax: +886-3-5729288; Email: cllu@mail.nctu.edu.tw
© 2007 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
There are, however, many biologically significant patterns whose formulations are beyond the capability of MuSiC. For example, many function-related protein sites, such as those collected in the PROSITE database (22), are expressed as regular expressions, which cannot be modeled using the substring-with-mismatch formulation of constraints implemented in MuSiC. An example of a regular expression pattern is the EGF-like domain signature 2 (EGF_2, PS01186 in PROSITE), C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C, which is related to the initiation of a signal transduction that results in DNA synthesis and cell proliferation. The meaning of this pattern is that the first residue is Cys, followed by one residue of any kind, then a Cys, followed by two residues of any kind, then a Gly or Pro, etc. Regular expressions are also convenient for describing variable ranges between patterns or between blocks within a pattern, which is necessary for some single patterns themselves, and useful in applications where different patterns are expected to occur in proximity to one another. In the above example of EGF_2, the 'x(4,8)' symbol preceding the last Cys indicates a gap of 4 to 8 residues between a residue of [F, Y or W] (Phe, Tyr or Trp) and that last Cys. Because regular expressions are so useful in describing biological patterns, an enhanced web server, RE-MuSiC (Multiple Sequence Alignment with Regular Expression Constraints), capable of handling regular expression constraints, has been developed.
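To make the reading of such a pattern concrete, the following is a minimal sketch of how a PROSITE-style pattern could be translated into an ordinary regular expression and searched for in a sequence. It is an illustration only, not the RE-MuSiC implementation: the function name prosite_to_regex and the toy sequence are assumptions, the EGF_2 pattern string is written out from the description above, and only the basic elements (single residues, x, [..] alternatives, {..} exclusions and repetition counts) are handled, not the terminus anchors of the full PROSITE syntax.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex (sketch)."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Split an element such as "x(4,8)" into its core and repetition range.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                      # x: any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # {AB}: any residue except A or B
        # [AB] alternatives and single letters are already valid regex atoms.
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)

# EGF_2 signature as described in the text.
egf2 = prosite_to_regex("C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C")

# Hypothetical toy sequence, not a real protein.
hit = re.search(egf2, "MKCACAAGFAAAAC")
print(egf2, hit.span())
```

In the toy sequence, the stretch between the [FYW] residue and the final Cys is four residues long, the shortest length that 'x(4,8)' admits.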
DIALIGN (8,9,12,13) (http://dialign.gobics.de/) is a well-known web server that can accept user-defined constraints as anchor points. The constraint formulations of DIALIGN and RE-MuSiC, however, are significantly different. In DIALIGN, a constraint consists of the exact positions of a pair of equal-length segments on two of the sequences, where these two segments are expected to be aligned together. Conflicts between constraints, if any, are resolved according to a weight function defined on the segment pairs. This formulation is more similar to that of Myers et al. (6). In RE-MuSiC, on the other hand, a constraint is a regular expression pattern. Each pattern may occur many times in a sequence, and the occurrences need not all have the same length. The occurrences that are aligned together to satisfy the constraints are those that optimize the overall alignment.
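The point that one pattern can have several occurrences of different lengths in the same sequence can be illustrated with a short sketch. The function all_occurrences, the pattern and the toy sequence below are assumptions made for illustration; the enumeration is deliberately brute-force and is not how RE-MuSiC locates occurrences.

```python
import re

def all_occurrences(regex: str, seq: str):
    """Return every (start, end) such that seq[start:end] matches regex exactly."""
    compiled = re.compile(regex)
    hits = []
    for start in range(len(seq)):
        for end in range(start + 1, len(seq) + 1):
            # fullmatch with pos/endpos tests exactly the slice seq[start:end].
            if compiled.fullmatch(seq, start, end):
                hits.append((start, end))
    return hits

# Pattern C-x(1,3)-C written as a Python regex; toy sequence, not real data.
print(all_occurrences("C.{1,3}C", "CACAC"))
```

Here the pattern occurs three times in a five-letter sequence, with two occurrences starting at the same position but having different lengths. A position-based anchor, as in DIALIGN, fixes one such segment in advance, whereas a regular expression constraint leaves the choice among the candidates to the alignment procedure.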

Using RE-MuSiC
RE-MuSiC provides an intuitive user interface (Figure 1). The user enters or pastes the input sequences (in FASTA format) in the largest blank field. The format for the constraints follows the PROSITE pattern format (please see the help page at http://140.113.239.131/RE-MUSIC/help.html for details). Each constraint is put within quotes, and adjacent constraints are separated by space characters. The user needs to specify whether the input sequences are proteins or DNA/RNA. The preferred scoring matrix may be chosen, and the gap open/extension penalties can be assigned. The user can also enter an email address so that a hyperlink to the alignment result will be sent via email. The output page shows the constrained alignment with the regions that satisfy the constraints shaded in yellow (Figure 2b). On the output page the user can also choose to download the alignment result in FASTA format or ClustalW format.
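For readers who prepare their input programmatically, the sketch below shows one way the two text fields described above could be parsed on the client side. It is an assumption-laden illustration: the function names are hypothetical, the quotes are assumed to behave like shell-style quotes, and the toy sequences are not real proteins. The help page of the server remains the authority on the accepted format.

```python
import shlex

def parse_constraints(field: str):
    """Split a constraints field of quoted, space-separated patterns (sketch)."""
    # shlex handles both single and double quotes; this is an assumption
    # about the input convention, not a documented property of the server.
    return shlex.split(field)

def parse_fasta(text: str):
    """Parse FASTA text into a dict mapping the first header word to the sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            name = header.split()[0] if header else f"seq{len(records) + 1}"
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

# Hypothetical toy input.
sequences = parse_fasta(">s1 toy\nAKSAGDL\n>s2 toy\nRTPLEM\n")
constraints = parse_constraints('"[ST]-x(2)-[DE]" "C-x-C"')
print(sequences, constraints)
```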

METHODS
The regular expression constrained sequence alignment problem was originally formulated by Arslan (23). The algorithm proposed in (23) is for pairwise alignment with a single constraint. In (24) Arslan extended the algorithm in (23) to support multiple alignment with multiple constraints. The algorithm proposed in (24) could be implemented directly, and it computes mathematically optimal constrained alignments. Unfortunately, its time complexity is extremely high, involving an exponential multiplicative factor in addition to the exponential time complexity of optimal (unconstrained) MSA computations. Even for pairwise alignment with multiple constraints, its worst-case time and space requirements are intensive. In addition, the algorithms in (23,24) cannot identify the regions of the resulting alignment that are responsible for satisfying the constraints; only the alignment score, without the alignment itself, is reported. Being able to report alignments, however, is important for a web server. A solution more suitable for practical applications is therefore needed.
For pairwise alignment with one regular expression constraint, we proposed in a previous study (25) an algorithm that is more efficient in both time and space than the one in (23). Furthermore, the alignment, in addition to the score, can be reconstructed without worsening the time and space complexity. In this work we extend the algorithm in (25) to support multiple constraints and multiple sequences, as required in RE-MuSiC. The resulting algorithm is more efficient than the one in (24) for pairwise alignment with multiple constraints. To deal with multiple sequences, a progressive method is implemented, using our improved pairwise algorithm as the kernel. For details of the algorithm the reader is referred to the supplementary material (available at http://140.113.239.131/RE-MUSIC/RE_MuSiC_method.pdf).
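To clarify what a pairwise alignment with one regular expression constraint asks for, the following brute-force sketch may help. It is explicitly not the algorithm of (25) nor the one used in RE-MuSiC: it simply tries every pair of occurrences of the pattern in the two sequences, aligns the parts before, inside and after the occurrences separately, and keeps the best total. The scoring values, the linear gap penalty (chosen so that the three partial scores add up to the score of one global alignment) and all function names are assumptions for illustration.

```python
import re

def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def occurrences(regex, seq):
    """All (start, end) such that seq[start:end] matches regex exactly."""
    compiled = re.compile(regex)
    return [(i, j) for i in range(len(seq)) for j in range(i + 1, len(seq) + 1)
            if compiled.fullmatch(seq, i, j)]

def constrained_score(s, t, regex):
    """Best score of an alignment in which one occurrence of regex in s is
    aligned against one occurrence in t; None if either sequence has none."""
    best = None
    for i1, j1 in occurrences(regex, s):
        for i2, j2 in occurrences(regex, t):
            score = (nw_score(s[:i1], t[:i2])          # left flanks
                     + nw_score(s[i1:j1], t[i2:j2])    # the two occurrences
                     + nw_score(s[j1:], t[j2:]))       # right flanks
            if best is None or score > best[0]:
                best = (score, (i1, j1), (i2, j2))
    return best

print(constrained_score("ACGT", "ACGT", "CG"))
```

Because every constrained alignment is also an ordinary global alignment, the constrained score can never exceed the unconstrained one. The exhaustive enumeration above becomes expensive when a pattern has many occurrences, which is the kind of cost a dedicated dynamic-programming algorithm is designed to avoid.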

Protein sequences with active site residues
The glutathione binding site (G-site) on glutathione S-transferase (GST) has been found to have conserved architectures across species (26). The chemical natures of the residues acting as G-site ligands and their interactions with glutathione are also analogous (26). In a reasonable alignment of GST protein sequences, therefore, the residues for the G-site are expected to be aligned together. A structural superposition of the crystal structures of GST proteins from different species also suggests that most of these G-site residues should be aligned together (26). The sequence identity of GST proteins from different species, however, is quite low; for example, it is reported in (26) that the pairwise sequence identity between the A. thaliana GST and each of six other non-plant GSTs is no more than 20.2%. In such a case, the low-similarity regions interfere with the alignment, making it difficult for a typical alignment tool to align the important residues well. An experiment is therefore undertaken to examine the performance of a typical alignment tool in this case, as well as to demonstrate how RE-MuSiC can be used to produce a more reasonable alignment.
In this experiment we analyze three GST proteins: (i) AtGST: a phi class GST from plant A. thaliana (PDBID: 1GNW); (ii) SjGST: an alpha class GST from non-mammalian S. japonicum (flat worm) (PDBID: 1M99); (iii) SsGST: a pi class GST from mammalian S. scrofa (pig) (PDBID: 2GSR). These sequences are first aligned using ClustalW (27). The result is shown in Figure 2a, where active site residues shared by these GSTs are boxed. It can be seen that some of the G-site residues fail to be aligned together, due to the low sequence similarity among these GST proteins. By querying PROSITE with the three proteins, it is found that they all share the pattern PS00006 ([ST]-x(2)-[DE]). Using this pattern as the constraint, RE-MuSiC is applied to align the sequences again. The result is shown in Figure 2b. As expected, the common pattern is aligned together. Meanwhile, the G-site residues are aligned properly, as desired. These results suggest that, given some information about common patterns, RE-MuSiC produces alignments in which biologically important residues are lined up more reliably, which is particularly important when the sequence identity is low. Being more reliable in aligning important residues together, RE-MuSiC may also be applied to align an unknown sequence with other sequences whose relevant residues are known, thus providing a convenient and cheap way to make a preliminary prediction of the residues in question on the unknown sequence. Note also that, in this experiment, the knowledge about the active site residues is not utilized in constructing the alignment; the constraints do not involve the active site residues themselves. Such a property is useful when the residues to be predicted are not expected to be conserved at the sequence level.
Figure 2 (partial caption). In the resulting alignment, RE-MuSiC annotated the region for the satisfaction of the constraint with a yellow block. It can be seen that the G-site residues are aligned properly, as desired. For both tools, the default parameter settings are adopted.
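What it means for a constraint to be satisfied by a finished alignment can be checked mechanically. The sketch below looks for column ranges in which every row, once its gap characters are removed, matches the pattern; such ranges correspond to the shaded regions on the output page. The function name satisfied_windows and the three aligned rows are hypothetical toy data, not the GST sequences, and the PS00006 pattern is written directly as a Python regular expression.

```python
import re

def satisfied_windows(rows, regex):
    """Column ranges (a, b) in which every row, with gaps removed, matches regex."""
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("rows must be non-empty and of equal length")
    compiled = re.compile(regex)
    width = len(rows[0])
    hits = []
    for a in range(width):
        for b in range(a + 1, width + 1):
            if all(compiled.fullmatch(r[a:b].replace("-", "")) for r in rows):
                hits.append((a, b))
    return hits

# Toy alignment (hypothetical rows); '-' denotes a gap.
rows = ["AKSAGD-L",
        "-RTPLE-M",
        "GHSV-VEK"]

# PS00006, [ST]-x(2)-[DE], as a Python regex.
print(satisfied_windows(rows, "[ST].{2}[DE]"))
```

In this toy case a single range of columns satisfies the constraint, and it includes a gap in the third row, which shows that the matched residues of different rows need not occupy exactly the same columns.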

RNA sequences with phylogenetically conserved pseudoknots
There is considerable evidence suggesting that phylogenetically conserved pseudoknots found in the 3′-UTRs of various coronaviruses are involved in RNA replication of these viruses (28). In an alignment of the 3′-UTR sequences of coronaviruses, therefore, it is desirable for these pseudoknots to be aligned together. However, the sequence identity among coronaviruses from different groups is often low, so it is not an easy task for a typical alignment tool to align the conserved pseudoknots together. In this experiment, we demonstrate that RE-MuSiC can be helpful in this situation.
First, ClustalW is applied to align these coronavirus sequences. The result is shown in Figure 3a. Not surprisingly, since the sequence identity is low, the phylogenetically conserved pseudoknots (shaded regions) are not aligned well. In (28), predicted secondary structures of the pseudoknots found in the 3′-UTRs of various coronaviruses are given, and a consensus of the pseudoknots is taken from them. Since loops in pseudoknots are, in general, less conserved, we exclude loop-region nucleotides from the consensus to enhance flexibility. The consensus of the pseudoknots can then be described as 'x(5)-C-U-x(4)-C-x(15,16)-U-G-x(2)-A-x(5,7)-G-x(4)-A-G-x(7,10)-U-x(3)-A-x(5)'. Using this consensus as the constraint, RE-MuSiC is applied to align these 3′-UTR sequences again. In Figure 3b, the pseudoknot regions on these coronaviruses can be seen to have been aligned properly. This demonstrates that RE-MuSiC can be used to help locate fragments that are conserved in structure. Indeed, this property, being a common advantage of the MuSiC series, has been utilized to predict the pseudoknot in the 3′-UTR of the SARS-TW1 coronavirus by aligning the 3′-UTR of SARS-TW1 with those of some other coronaviruses whose pseudoknot regions are known (15,16). RE-MuSiC further provides the flexibility of variable ranges between conserved nucleotides or regions in constraints, which is necessary for describing the whole consensus of the pseudoknot in this experiment. This is a significant advance over previous generations of MuSiC.
Figure 3. Results of the experiment on coronaviruses with 3′-UTR pseudoknots. (a) A partial view of the alignment produced by ClustalW. The shaded regions, corresponding to the phylogenetically conserved pseudoknots, are not aligned well. (b) A partial view of the alignment produced by RE-MuSiC. The consensus of the pseudoknots on the four coronaviruses involves variable ranges between residues. RE-MuSiC has a constraint formulation flexible enough to express this consensus. As expected, the regions for the pseudoknots are aligned properly by RE-MuSiC. For both tools, the default parameter settings are adopted.
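The effect of the variable ranges in the pseudoknot consensus can be quantified with a few lines of code. The sketch below, whose function name span_range is an assumption, sums the minimum and maximum number of nucleotides that each element of the pattern can cover, giving the shortest and longest fragment that the consensus can match.

```python
import re

def span_range(pattern: str):
    """Return (min, max) length of a sequence matched by a PROSITE-style pattern."""
    lo_total = hi_total = 0
    for element in pattern.strip().rstrip(".").split("-"):
        # An element is a core symbol with an optional (n) or (n,m) repetition.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        lo = int(m.group(2)) if m.group(2) else 1
        hi = int(m.group(3)) if m.group(3) else lo
        lo_total += lo
        hi_total += hi
    return lo_total, hi_total

consensus = ("x(5)-C-U-x(4)-C-x(15,16)-U-G-x(2)-A-x(5,7)-G-x(4)"
             "-A-G-x(7,10)-U-x(3)-A-x(5)")
print(span_range(consensus))  # (61, 67)
```

The consensus therefore matches fragments of 61 to 67 nucleotides, a spread that a fixed-length substring-with-mismatch constraint cannot express.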

SUMMARY
Imposing constraints is an effective way to incorporate biological knowledge into an alignment tool. Previous versions of MuSiC do not support many biologically significant patterns. RE-MuSiC adopts regular expressions as its constraint formulation, which is useful in expressing PROSITE patterns or structural elements that often involve variable ranges between conserved parts. The algorithm underlying RE-MuSiC represents an improvement over the previously proposed algorithm, and is more appropriate for implementation in a web server. Experiments on GST proteins and on coronaviruses with phylogenetically conserved pseudoknots demonstrate that, with additional knowledge incorporated, RE-MuSiC is able to produce meaningful alignments in which important residues or structural elements are aligned properly, even if the similarity among input sequences is low. This ability is also useful for prediction purposes.