Binding mechanism of the matrix domain of HIV-1 Gag on lipid membranes

Specific protein-lipid interactions are critical for viral assembly. We present a molecular dynamics simulation study of the binding mechanism of the membrane-targeting domain of the HIV-1 Gag protein. The matrix (MA) domain drives Gag onto the plasma membrane through electrostatic interactions at its highly basic region (HBR), located near the myristoylated (Myr) N-terminus of the protein. Our study suggests that Myr insertion is involved in sorting membrane lipids around the protein-binding site to prepare it for viral assembly. Our realistic membrane models confirm that interactions with PIP2 and PS lipids are highly favored around the HBR and are strong enough to keep the protein bound even without Myr insertion. We characterized Myr insertion events from microsecond trajectories and examined the membrane response upon initial membrane targeting by MA. Insertion events occur with only one of the membrane models, showing that a combination of surface charge and internal membrane structure modulates this process.


Sample-size estimation
• You should state whether an appropriate sample size was computed when the study was being designed
• You should state the statistical method of sample size computation and any required assumptions
• If no explicit power analysis was used, you should describe how you decided what sample (replicate) size (number) to use
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:

Replicates
• You should report how often each experiment was performed
• You should include a definition of biological versus technical replication
• The data obtained should be provided and sufficient information should be provided to indicate the number of independent biological and/or technical replicates
• If you encountered any outliers, you should describe how these were handled
• Criteria for exclusion/inclusion of data should be clearly stated
• High-throughput sequence data should be uploaded before submission, with a private link for reviewers provided (these are available from both GEO and ArrayExpress)
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:
This section does not apply to the theoretical technique we employ for our studies, molecular dynamics simulation. We discuss our statistical analysis approach and simulation replicates in the following sections. The membrane model selection and protein structure used in this study are described in the Methods section of the main manuscript and in Table 1 in the Results section.

Statistical reporting
• Statistical analysis methods should be described and justified
• Raw data should be presented in figures whenever informative to do so (typically when N per group is less than 10)
• For each experiment, you should identify the statistical tests used, exact values of N, definitions of center, methods of multiple test correction, and dispersion and precision measures (e.g., mean, median, SD, SEM, confidence intervals; and, for the major substantive results, a measure of effect size (e.g., Pearson's r, Cohen's d))
• Report exact p-values wherever possible alongside the summary statistics and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission: (For large datasets, or papers with a very large number of statistical tests, you may upload a single table file with tests, Ns, etc., with reference to sections in the manuscript.)

Group allocation
• Indicate how samples were allocated into experimental groups (in the case of clinical studies, please specify allocation to treatment method); if randomization was used, please also state if restricted randomization was applied
• Indicate if masking was used during group allocation, data collection and/or data analysis
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:
We describe the systems, their sizes, and the number of simulation runs per system in the Methods section, in Table 1 of the main manuscript, and in Table S2 of Supplementary File 1.
As standard practice, MD simulations are run in duplicate or triplicate to provide independent trajectories for subsequent statistical analysis. We ran two independent trajectories for the short protein-membrane interaction simulations, as listed in Table 1. Three replicates were run for the microsecond simulations that explore insertion of the lipidated tail of the protein into the membrane, and two for the membrane systems with three protein units on the surface, as listed in Table S2.
The data analysis, described in detail in Table S3, was performed over equilibrated sections of the simulation trajectories, unless the analysis shows the time evolution of a given property. Values reported in this manuscript were block-averaged every 10 ns over the last 100-300 ns of each trajectory, and the standard error is reported along with the average. For the time series shown in the figures, the raw data are plotted in a faded hue and the moving average, computed using 5 ns blocks, is plotted in bold colors.
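The block-averaging and smoothing steps described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names, the uniform sampling interval `dt_ns`, and the reading of "5 ns blocks" as a sliding window are assumptions made for illustration only.

```python
import numpy as np


def block_average(values, dt_ns, block_ns=10.0, window_ns=100.0):
    """Return (mean, standard error) from block means over the last
    `window_ns` of a uniformly sampled time series.

    All argument names are hypothetical; `dt_ns` is the assumed
    spacing between saved frames, in nanoseconds.
    """
    values = np.asarray(values, dtype=float)
    per_block = int(round(block_ns / dt_ns))     # frames per block
    n_blocks = int(round(window_ns / block_ns))  # blocks in the window
    n_needed = per_block * n_blocks
    if per_block < 1 or n_blocks < 2 or values.size < n_needed:
        raise ValueError("series too short for the requested blocks")
    # Keep only the trailing (equilibrated) part of the trajectory.
    tail = values[-n_needed:]
    block_means = tail.reshape(n_blocks, per_block).mean(axis=1)
    mean = block_means.mean()
    # Standard error of the mean estimated from the block means.
    sem = block_means.std(ddof=1) / np.sqrt(n_blocks)
    return mean, sem


def moving_average(values, dt_ns, window_ns=5.0):
    """Sliding-window mean with a window of `window_ns` nanoseconds."""
    values = np.asarray(values, dtype=float)
    width = max(1, int(round(window_ns / dt_ns)))
    kernel = np.ones(width) / width
    # mode="valid" drops the edges where the window is incomplete.
    return np.convolve(values, kernel, mode="valid")
```

For a property saved every 0.1 ns, `block_average(series, dt_ns=0.1, block_ns=10.0, window_ns=100.0)` would average the last 100 ns in ten 10 ns blocks. Using block means rather than individual frames reduces the effect of frame-to-frame correlation on the error estimate.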