High-quality, high-throughput, low-cost simultaneous video recording of 60 animals in operant chambers using PiRATeMC

Background: The development of Raspberry Pi-based recording devices for video analyses of drug self-administration studies has shown promise in affordability, customizability, and capacity to extract in-depth behavioral patterns. Yet most video recording systems are limited to a few cameras, making them incompatible with large-scale studies. New method: We expanded the PiRATeMC (Pi-based Remote Acquisition Technology for Motion Capture) recording system by increasing its scale, modifying its code, and adding equipment to accommodate large-scale video acquisition, accompanied by data on throughput capabilities, video fidelity, synchronicity of devices, and comparisons between Raspberry Pi 3B+ and 4B models. Results: Using PiRATeMC default recording parameters resulted in minimal storage (~350 MB/h), high throughput (<~120 seconds/Pi), high video fidelity, and synchronicity within ~0.02 seconds, affording the ability to simultaneously record 60 animals in individual self-administration chambers for various session lengths at a fraction of commercial costs. No consequential differences were found between Raspberry Pi models. Comparison with existing method(s): This system acquires an order of magnitude more simultaneous video data than other recording systems, with smaller storage needs and lower costs. Additionally, we report in-depth quantitative assessments of throughput, fidelity, and synchronicity, displaying real-time system capabilities. Conclusions: The system presented can be fully installed by a single technician in a month's time and provides a scalable, low-cost, quality-assured procedure with a high degree of customization and synchronicity between recording devices, capable of recording a large number of subjects and timeframes with high turnover in a variety of species and settings.


Introduction
The use of animal models in operant self-administration paradigms is critical to advancing the characterization of addiction behaviors, the mapping of neurobiological mechanisms, and their translation into human treatments and interventions (Spanagel, 2017). Animal models of drug self-administration typically use variables such as the number of infusions, active lever presses, resistance to punishment, or breaking point to measure psychological constructs such as "craving", "motivation", "escalation", and "vulnerability" (Belin et al., 2009; de Guglielmo et al., 2023, preprint, https://doi.org/10.7554/eLife.90422.1; Kallupi et al., 2022, preprint, doi:10.1101/2022.07.26.501618). However, these simple and predefined behavioral measures are limited in their capability to provide in-depth information about the organization and complexity of behavior during an operant session.
The advancement of machine-learning-based video analysis by way of pose estimation provides an opportunity to identify underlying behavioral motifs and complex species-specific behavioral changes that can better predict behaviors of interest, with developed methods such as DeepLabCut (DLC) (Mathis et al., 2018; Nath et al., 2019), Variational Animal Motion Embedding (VAME) (Luxem et al., 2022), Social LEAP Estimates Animal Poses (SLEAP) (Pereira et al., 2022), and Simple Behavioral Analysis (SimBA) (Goodwin et al., 2024). Such automated methods substantially accelerate data acquisition from experimental recordings by overcoming the time-consuming nature and low inter-rater reliability of manual annotation by experimenters. Additionally, they can identify fixed-action patterns that human observers might not detect or find difficult to track across many frames over extended hours of video. The models generated by these machine-learning-based methods can be shared and executed between labs investigating different species, settings, and behaviors, and many ensure cross-compatibility between methods (e.g., SLEAP or DLC results can be uploaded into SimBA), leading to more robust constructs and defined parameters to confirm such constructs. Similarly, reporting the materials utilized, such as cameras and lenses, further ensures reproducibility and execution of predeveloped models, and generates demand for the development of such models. But current video recording systems are either expensive proprietary technologies (Med Associates, The Imaging Source, GoPro, Point Grey, Basler, etc.) or inexpensive open-source systems (usually based on Arduino and Raspberry Pi) that currently lack demonstrated scalability, with most systems validating no more than 1-16 cameras recorded simultaneously in a variety of species and systems (Saxena et al., 2018; Singh et al., 2019; Weber and Fisher, 2019, preprint, doi:10.1101/596106; Hou and Glover, 2022; Marcus et al., 2022; Centanni and Smith, 2023).
To address this gap in the literature, we tested the capability, throughput, fidelity, and synchronicity of videos of the PiRATeMC (Pi-based Remote Acquisition Technology for Motion Capture) protocol by Centanni and Smith (2023). This study also provides modifications of the code to simultaneously record 60 animals in independent operant chambers, along with functional comparisons between the 3B+ and 4B models of Raspberry Pis, a 24 h session to test system stability, and system comparisons with other Pi-based and commercially available alternatives. A link to a GitHub tutorial providing greater detail for transparency and replication of the system design, as well as the code blocks for the FFmpeg tests and analyses, is also provided.

Methods and materials
Throughout the present paper, any file referenced in the PiRATeMC system is set in italics, while any text that is code typed into a terminal of a computer or the Raspberry Pis (RPi) follows the same convention as Centanni and Smith (2023) by bolding and italicizing, as such: terminal_code.

PiRATeMC system
The PiRATeMC (Pi-based Remote Acquisition Technology for Motion Capture) system was implemented as outlined in Centanni and Smith (2023) for a cluster design. This includes the basic components, such as the RPis and their accessories, the network switch, and the remote controller, as well as the code necessary to allow these three devices to interface. An accessible instructional manual may be found on their GitHub (https://github.com/alexcwsmith/PiRATeMC/blob/master/README.md). In short, this requires assembling the RPi parts; flashing a micro-SD card with "PiCamOS", a modified version of the Raspberry Pi OS "Buster"; installing the latest Ubuntu operating system (OS) on the remote controller; installing packages and copying files on the remote controller to interface with the network switch; connecting to the RPi and updating the recording file with your specifications; updating variables in the .bashrc of each RPi with the remote controller information; updating the .bashrc file of the remote controller with the IP addresses of all RPis on its network; and running cssh with all RPi IP addresses in a terminal to interface with each connected RPi.
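As a minimal sketch of the final two steps (the user name and IP addresses below are placeholders, not values from the original protocol, and cssh assumes the clusterssh package is installed), the remote controller's .bashrc entries and the cssh invocation might look like:

```shell
# Hypothetical excerpt from the remote controller's ~/.bashrc;
# replace the placeholder addresses with your own RPi cluster IPs
export PI01=pi@192.168.1.101
export PI02=pi@192.168.1.102
export PI03=pi@192.168.1.103

# cssh opens one synchronized terminal per RPi, so a command typed
# once is mirrored to every connected Pi simultaneously
cssh $PI01 $PI02 $PI03
```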

Coding alterations and equipment additions
In terms of coding, the call variables $REMOTEPASS and $REMOTEVIDPATH present in the .bashrc and recordVideo.sh files were not utilized due to the recurring errors permission denied, referring to $REMOTEPASS calling the remote's password, and no such directory, referring to $REMOTEVIDPATH calling the path to the video storage location on the remote. To aid technicians in troubleshooting, echo statements were added after each step in recordVideo.sh to better locate coding errors. Lastly, to accommodate different experimental designs occurring simultaneously, the recordVideo.sh file was saved as two separate scripts with their respective video paths. A tutorial on how to install and configure the updated PiRATeMC system can be found on GitHub (https://github.com/George-LabX/raspicluster/blob/main/README.md). In terms of equipment, the additions were the Arducam fisheye lens, the Raspberry Pi PoE+ HAT, Seagate 5TB external hard drives, and a locally manufactured acrylic device for secure camera installation on a modular chamber.

Operant chambers
The RPis were housed in ENV-007CT Med Associates modular operant chambers (53.34 cm × 34.93 cm × 27.31 cm) designed for rodents, equipped with an ENV-005 grid floor (28.6 cm × 24.1 cm). A comprehensive inventory of the individual products utilized in this design, including those in the PiRATeMC paradigm, is provided in Table 1 below.

Statistical analyses
Data on video size were gathered by averaging 60 videos from a 1-hour mock session at the default parameters of 30 frames per second (FPS), 10 M bitrate, and 800×600 resolution. To acquire data on throughput, fidelity, synchronicity, and functionality of the system, we ran four separate one- and two-hour sessions in mock conditions: 15 boxes for 2 hours (2 h-15 Boxes), ten boxes for 2 hours (2 h-10 Boxes), ten boxes for 1 hour (1 h-10 Boxes), and five boxes for 1 hour (1 h-5 Boxes) for scalable comparisons (Fig. 1). A separate 24-hour session with 15 boxes (24 h-15 Boxes) was conducted to test the stability of the RPis and their capacity to facilitate long-form experiments, such as those of laboratories engaging in sleep analysis studies, with subsequent analyses of fidelity. Results were analyzed using R Studio (R version 4.3.1 (2023-06-16 ucrt)). Preliminary analyses with QQ plots, Shapiro-Wilk tests, and Levene's tests were conducted to ensure the proper statistical comparisons were made.

Throughput
Throughput was analyzed by documenting the echo notations when the conversion to MP4 began ("Converting to mp4"), when the transfer to the remote began ("Transferring…"), and when the transfer was complete ("mp4 transfer complete"), as modified in the PiRATeMC recordVideo.sh recording file for the four-session experiment. Conversion time (convert_secs) was defined as the time between the "Converting…" timestamp and the "Transferring…" timestamp, and transfer time (trans_secs) as the time between the "Transferring…" timestamp and the "…complete" timestamp. Medians and standard errors of each were subsequently computed, as well as the full processing time (convert_secs + trans_secs = process_secs) from each RPi to the remote PC (Fig. 1b).
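The interval arithmetic can be sketched as a small awk pass over the logged notations. This is an illustrative reconstruction, not the scripts used in the study: the epoch-stamped log format below is an assumption, and the three-line log is inlined only so the example is self-contained.

```shell
# Hypothetical session log: epoch seconds followed by the echo text
log=$(cat <<'EOF'
1000 Converting to mp4
1090 Transferring...
1120 mp4 transfer complete
EOF
)

# Derive convert_secs, trans_secs, and process_secs from the timestamps
result=$(printf '%s\n' "$log" | awk '
  /Converting/        { t_conv  = $1 }
  /Transferring/      { t_trans = $1 }
  /transfer complete/ { t_done  = $1 }
  END {
    convert  = t_trans - t_conv     # conversion time (s)
    transfer = t_done  - t_trans    # transfer time (s)
    printf "convert=%d transfer=%d process=%d", convert, transfer, convert + transfer
  }')
echo "$result"
```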

Fidelity
To investigate fidelity (Fig. 1c), we modified recordVideo.sh not to delete the original H.264 file and to transfer it alongside the MP4 file to the remote PC. The H.264 (reference) video file and the distorted (product) MP4 video were analyzed utilizing Windows PowerShell (1 h and 2 h sessions) or Bash (24 h session) scripts and FFmpeg functions to compute three full-reference (FR) metrics: Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Video Multimethod Assessment Fusion (VMAF). SSIM measures the elements of luminance, contrast, and structure, which are relatively independent of one another and are combined into a single score between −1 and 1 (Wang et al., 2004; Venkataramanan et al., 2021). PSNR is expressed as a decibel (dB) logarithm of the mean squared error (MSE) (Deshpande et al., 2018; Setiadi, 2021) of the luma (Y) and chromatic (U, V) elements. VMAF, recently developed by Netflix, combines modified versions of the quality metrics Visual Information Fidelity (VIF) and Detail Loss Metric (DLM) with a metric of motion computed from the luminance element (Li et al., 2016), and is found to be highly correlated with the human visual system (HVS). SSIM, PSNR, and VMAF metrics of the system recordings were computed by exporting the results of each H.264 file and its corresponding MP4 file before computing the median and standard error across all videos.
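As an illustrative command sketch (the file names are placeholders, and the libvmaf filter additionally requires an FFmpeg build compiled with VMAF support), each FR metric compares the distorted MP4 against the H.264 reference:

```shell
# Placeholders: video.mp4 (distorted product), video.h264 (reference).
# Each command prints its metric to the log without writing an output file.
ffmpeg -i video.mp4 -i video.h264 -lavfi ssim -f null -
ffmpeg -i video.mp4 -i video.h264 -lavfi psnr -f null -
ffmpeg -i video.mp4 -i video.h264 -lavfi libvmaf -f null -
```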
Separately, we computed the Inter-Frame Interval Variability (IFIV) and subsequent Jitter, defined as "repeated or dropped frames" (Huynh-Thu and Ghanbari, 2006), to capture frames-per-second inconsistency. The IFIV was computed by exporting frame timestamps from each MP4 file with the FFmpeg function FFprobe, which prints multimedia elements in readable form, and calculating the time between each timestamp in seconds. To account for the discrepancy between the number of observed frames (i.e., the number of frames exported over the recording length) and the number of expected frames (i.e., the number of frames expected given the session length, e.g., one hour = 108,000 frames), the IFIV was then calculated as follows:

IFIV = (expected frames / observed frames) / (30 FPS)

Jitter was then computed as a ± percentage of this discrepancy by subtracting the ratio of expected to observed frames from one.
Jitter = 1 − (expected frames / observed frames)

A secondary measure of dropped frames that considers only the observed recording was computed by separately exporting the length of the session in seconds, converting that length to frames, and comparing the difference with the observed number of frames as follows:

dropped frames = (session seconds × 30 FPS) − observed frames

Synchronicity

Synchronicity (Fig. 1d) was determined by beginning a recording with all operant chambers closed and running a Med Associates program to initiate both cue lights for two seconds, then determining for each RPi the moment the cue lights turned on and whether this was timestamped differently across Pis. This was conducted by extracting the first 10 seconds of each video as individual images and utilizing the SSIM analysis function in FFmpeg. Using a dark reference image and comparing it with all subsequently exported images at the peak of the light event enables identification of an inverse relationship between luminance (Y) and the event, with the lowest score representing the light event. Each RPi's frame representing the light event was subsequently divided by the FPS (30) to obtain the time in seconds at which the event occurred. The mean and standard error were taken across all light frames.
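As a worked numerical sketch of these frame-count metrics (the frame counts below are illustrative values, not measured data; dropped frames are obtained by converting the session length to frames, as described), the metrics for a hypothetical one-hour recording can be computed as:

```shell
# Illustrative inputs for a hypothetical one-hour recording
fps=30
session_secs=3600
expected=$((session_secs * fps))   # 108000 frames expected at 30 FPS
observed=107900                    # frames actually exported by FFprobe

metrics=$(awk -v e="$expected" -v o="$observed" -v fps="$fps" 'BEGIN {
  ifiv    = (e / o) / fps          # inter-frame interval variability
  jitter  = 1 - (e / o)            # +/- fraction of frame discrepancy
  dropped = e - o                  # dropped-frame count
  printf "ifiv=%.6f jitter=%.6f dropped=%d", ifiv, jitter, dropped
}')
echo "$metrics"
```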

Categories and variables
Recordings were compared by Time (1 h and 2 h) and by RPi model (3B+ and 4B) to determine potential differences by session length and between two of the most recent available Raspberry Pi models. Table 2 below displays the variables compared across categories, complete with name, definition, and operationalization.
Normality and homogeneity of variances were tested using QQ plots, Shapiro-Wilk tests, and Levene's tests. Due to failures of these tests, comparisons between RPi models for all variables were computed with Wilcoxon-Mann-Whitney tests, except IFIV and Jitter, which were computed with Student's t tests. Comparisons by Time for the throughput variables conversion seconds, transfer seconds, and processing seconds between the 1 h and 2 h sessions were computed with Wilcoxon-Mann-Whitney tests, while SSIM, PSNR, VMAF, IFIV, and Jitter between the 1 h, 2 h, and 24 h sessions were compared with Kruskal-Wallis tests and post-hoc Dunn's tests with Bonferroni adjustments. The 24 h sessions were not included in the throughput comparisons because conversion from H.264 to MP4 requires twice the file size, more space than available on the 16 GB SD cards used in this design (~8.3 GB per 24 h video); the H.264 files were instead transferred and converted on the remote PC before undergoing tests of fidelity. Comparisons of the variables light frame and light seconds between all groups were conducted with Levene's tests of variance; given that the recordings were conducted on separate computers, tests of internal variability were preferred to determine synchronicity.

Raspberry Pi and camera assembly and installation
Fig. 2 displays the parts necessary to assemble the system as instructed by Centanni and Smith (2023), with the addition of the Arducam wide-angle fisheye lens (175°), as the original NoIR v2 camera did not allow visibility of the chamber levers when secured to the top pane of glass (Fig. 3). The RPis were then positioned on the inside wall of the outer sound-attenuated box using two standard M3 x 1" self-tapping screws on the right wall, at a distance corresponding to the width of the RPi circuit board. The RPis were secured in place with loose zip-ties to the screws by using the PoE+ HAT posts, allowing for easy removal or placement of the RPi and access to the micro-SD cards as needed (Fig. 7a). This arrangement ensured that the RPis were in a location with sufficient airflow for cooling and allowed unobstructed access for the camera cable to reach its designated operating position, while also remaining close to the hole in the side of the box for the ethernet cable.
To provide consistent recordings, the cameras need to be positioned in a location with full visibility into the operant chamber, not obscured by the outer glass, without obstructing the catheter line into the chamber or the technician installing the subject, and must not be moved at any point during the experiment. To address these requirements, the cameras were secured to a device laser-cut from acrylic specifically for this configuration, called a BoxTop, with sections removed for the catheter line and camera installation. This was achieved with two M3 x 8 mm self-tapping screws through both the camera board and the device. The top of the ENV-007CT modular chamber has a 5 cm (50 mm) diameter hole located at the center of the plexiglass or metal material; the BoxTop measures 70 mm x 70 mm x 4 mm and was secured to the top of the operant chamber by two M3 x 8 mm self-tapping screws at diametric corners (Figs. 4 and 5). The lens sat approximately 28 cm from the grid floor after installation.

Ethernet switch
For multiple RPis to communicate over Ethernet, a centralized integration hub is needed. The Cisco Catalyst 3650 24-port switch provides both connectivity and Power over Ethernet to the RPis, eliminating a power cable and simplifying configuration with the remote controller. Approximately five minutes after being powered on, the switch is fully booted and ready to plug and play. This ethernet switch does not require an internet connection and enables clustering and simultaneous communication with up to 23 RPis; models are available with up to 48 ports, and switches can be stacked to accommodate ever-increasing numbers of RPis on one network.

Remote controller
With a device for recording and a hub of integration established, an instrument to dispatch commands in a synchronized fashion and receive the videos produced is necessary to function as a remote controller. The latest version of Ubuntu desktop is suggested to ensure seamless interaction with the Raspbian Linux operating system on the RPis. Installation is carried out with a USB drive containing at least 4 GB of storage and creating a bootable Ubuntu image. A tutorial of the process can be found at https://ubuntu.com/tutorials/create-a-usb-stick-on-windows#1-overview. The AWOW mini-PC (and many other brands or PC options) can be found in multiple storage and RAM sizes to accommodate individual preferences; Fig. 6 shows the system information of those used in this design.

Video path and storage
Given that the Linux operating system currently lacks the "online-only" feature of Dropbox, each AWOW was fitted with a Seagate 5TB external hard drive in one of its two USB 3.0 ports for storage of recordings before being uploaded to Dropbox via an internet browser.

Infrared installation
The infrared lights were installed approximately 1-2" from the ceiling on the back wall of the outer sound-attenuating box (Fig. 7b and 7c), pointing directly into the chamber or downwards against the wall. The reflected light provides sufficient visibility for recordings or images. In the ENV-018V cubicles, which have greater height clearance, lights were installed on the back right corner of the ENV-007CT modular operant chamber itself.

Additions and remarks to recordVideo.sh
Recording multiple experiments with different variables or identifiers can be challenging with a single recording file, as it requires modifying the code on each Pi prior to the experiment. Furthermore, with a single location for all video files, it may become difficult to parse between the different experiments running at any given time. To accommodate this, separate recording files can be created for each variable or experiment identifier, complete with their own respective video paths and specifications, avoiding the need to change the recording file for every experiment. Each element of the recording file, such as session length, color or black-and-white image, location where files are to be transferred, bitrate, frames per second, and resolution, is customizable to fit one's experimental design. This can be accomplished by editing the recordVideo.sh file by executing "nano recordVideo.sh" and changing the desired path and parameters, then saving it with a unique name for identification and execution.
For example, reccoc.sh ("record cocaine", Fig. 8) is one such file created from recordVideo.sh, complete with its respective path to its folder on the Seagate hard drive (Expansion). Once created, the file is made executable with "chmod +x", meaning "change mode of file to allow execution"; in this example, chmod +x reccoc.sh.
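The copy-and-enable workflow can be sketched as follows; the first line creates a stand-in recordVideo.sh purely so the example is self-contained, and the edits to the video path are left as a comment since they are made interactively in nano.

```shell
# Stand-in recordVideo.sh so this sketch runs anywhere;
# in practice the real PiRATeMC file is already on the Pi
printf '#!/bin/sh\necho "recording placeholder"\n' > recordVideo.sh

cp recordVideo.sh reccoc.sh   # save under a unique, identifiable name
# ...edit the video path and parameters inside reccoc.sh with nano...
chmod +x reccoc.sh            # "change mode": allow execution
```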

Execution of recordVideo.sh
Once the materials are installed, the connections and interfaces are established, and the recording parameters are set, executing the recordVideo.sh file by running "./recordVideo.sh <video_name> <length (mins)> <FPS>" begins the recording function on all RPis connected to a switch simultaneously (Fig. 9).

System ease of use and maintenance
This system can be implemented in any laboratory by a single technician with no coding experience or need for a specialist, and fully installed in a month's time or faster depending on the number of chambers or RPis desired, as outlined in the tutorial on GitHub and summarized above, by executing a few lines of code and assembling the necessary products. Editing the recording files with "nano <file_name>" and adjusting the parameters (e.g., bitrate, resolution, etc.), then executing said file with one line of code to begin recordings and acquire video data, enables any researcher to devise and carry out an experiment of choice with no hands-on coding experience.
Furthermore, given that the RPis are headless (no monitor connected) and not connected to the institutional network, and that the OS is preloaded with the necessary files as described, maintenance is minimal, as the RPis themselves will not need to be regularly updated. The remote PC, when prompted, is updated like any PC or laptop, though this is not required to carry out recordings and storage. When executing the recording file, a major advantage of the RPi and RaspiCam recording functions used is the explicit notice when an error occurs and what the error refers to. The most common are "MMAL" errors, referring to the multimedia abstraction layer and the creation of the "camera component", typically in relation to a disconnection of the ribbon cable or of the "Sunny" connection, the small front-facing component connecting the camera to the circuit board, each of which can arise as chambers are moved around during an experiment. In total, maintenance of the system reduces to routinely erasing old recordings off the RPis to prevent loss of data (recordings still execute when storage is full but do not save), cleaning and refocusing lenses when needed, protecting the exposed circuit boards of the cameras from environmental hazards such as drug solution from the catheter line or dust and dander from the chambers in whichever way a laboratory sees fit (e.g., waterproof tape), maintaining sufficient supplies to replace any necessary parts, and addressing these issues as they arise.
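Routine clearing of already-transferred recordings can be scripted; the following is a minimal sketch, in which the Videos directory name and the 7-day retention threshold are assumptions for illustration rather than part of the original protocol.

```shell
# Create an illustrative video directory so the sketch is self-contained
mkdir -p Videos && touch Videos/session1.h264 Videos/session1.mp4

# List, then delete, recordings older than 7 days; freshly written
# files are untouched, so only aged, verified transfers are removed
find Videos \( -name '*.h264' -o -name '*.mp4' \) -mtime +7 -print
find Videos \( -name '*.h264' -o -name '*.mp4' \) -mtime +7 -delete
```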

Video size
Only two parameters in the recordVideo.sh file of the PiRATeMC paradigm affect the resulting video size: bitrate (set at 10 M; raspivid defaults to 17 M) and resolution (set at 800x600). When executing the recordVideo.sh file at 30 frames per second, a bitrate of 10 M, and 800x600 resolution, then summing all files and dividing by N=60, the average one-hour video is approximately ~350 MB per RPi. For the current system with 60 RPis, that results in about 21 GB per hour, ~127 GB for a 6-hour self-administration session, and ~254 GB for a 12-hour session. The same method of averaging found that the average 24 h video was ~8.3 GB.
If one were to establish a single AWOW and 24-port Catalyst design with the maximum 23 RPis, that equates to ~8.13 GB of video every hour. Considering that the micro-SD cards on the RPis are only 16 GB (larger models are available) and the average hour of video is ~350 MB, the micro-SD cards will need to be cleared at most every ~45 hours of video, while a 256 GB AWOW connected to 23 RPis will need to be cleared every ~31 hours (not accounting for storage allocated to the OS on each device).
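These clearance intervals follow directly from the per-hour figure; a small arithmetic sketch (using nominal capacities and ignoring OS overhead, as noted above):

```shell
# Back-of-envelope storage arithmetic from the ~350 MB/h figure
mb_per_hour=350
sd_card_mb=16000      # nominal 16 GB micro-SD per RPi
remote_mb=256000      # nominal 256 GB AWOW receiving from 23 RPis

pi_hours=$((sd_card_mb / mb_per_hour))             # hours of video per SD card
remote_hours=$((remote_mb / (mb_per_hour * 23)))   # hours before the remote fills
echo "SD card: ~${pi_hours} h of video; 256 GB remote with 23 RPis: ~${remote_hours} h"
```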

Group comparisons
All descriptive statistics for comparisons by Time and by RPi model are presented in Table 3 below. For the variables conversion seconds, transfer seconds, processing seconds, SSIM, PSNR, and VMAF, the values reported for both Time and RPi models are medians and standard errors of the median, while IFIV and Jitter are reported as means and SEM. Fidelity comparisons of the three FR metrics by Time between the 1 h, 2 h, and 24 h sessions, as revealed by Kruskal-Wallis tests, found no significant differences in SSIM (H(2) = 3.32, p = .19) or VMAF (H(2) = 2.24, p = .326), but a significant difference was found in PSNR (H(2) = 3.38, p < .001), specifically between the 1 h and 24 h (z = 3.98, p < .001) and 2 h and 24 h (z = 5.74, p < .001) but not 1 h and 2 h (z = −1.29, p = .195) sessions, as shown by post-hoc tests (Fig. 11).

Synchronicity
Fig. 16 below illustrates the statistical rationale used to determine the light-event frame, and therefore the time in seconds at which the light event occurred, as a measure of synchronicity across RPis in a given session. The full-reference metric SSIM computes three separate measures as a single score from −1 to 1, determining the visual-quality similarity between a reference image or video and a distorted (product) image or video, one component of which is the luminance (Y) within an image. Therefore, if a dark image is used as the reference image, the frames in which the cue lights occur, and thereby the particular frame in question, will have lower scores given the dissimilarity in visual brightness. Fig. 16 shows the results of one chamber's SSIM scores across the full set of frames tested (16a) and a scaled view of the frames in which the cue lights were on (16b).
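As a command sketch of this procedure (file names are placeholders for illustration), the first 10 seconds of a recording can be exported as individual frames, and each frame then compared against the dark reference image with the SSIM filter:

```shell
# Export the first 10 s (300 frames at 30 FPS) as numbered images
mkdir -p frames
ffmpeg -i video.mp4 -t 10 frames/frame_%03d.png

# Compare one exported frame against the dark reference; the frame
# with the lowest SSIM score marks the cue-light event
ffmpeg -i frames/frame_001.png -i dark_reference.png -lavfi ssim -f null -
```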
Once each frame was determined within a given recording, Levene's tests of differences in variance by Time (F(1,38) = 0.03, p = .862) and by RPi model (F(1,13) = 1.36, p = .264) determined no significant differences in variance. For all recordings in the system (N=40), the mean light frame was 180 ± 0.7, or 6 ± 0.02 seconds.

Scalability
Commercial manufacturers such as Med Associates can only connect a maximum of eight recording devices per computer given software constraints, and can run only four cameras at the maximum resolution (640 x 480) at 15 fps, or only two at 30 fps, significantly limiting throughput capabilities and scalability, as described in the Video Monitor's User Manual. In contrast, previous Pi-based studies have exhibited scalability with 1-16 synchronized RPis in USB-type setups (Mathis et al., 2018; Nath et al., 2019; Hou and Glover, 2022), WiFi connections (Singh et al., 2019; Weber and Fisher, 2019, preprint, doi:10.1101/596106; Marcus et al., 2022), or through Ethernet (Saxena et al., 2018; Centanni and Smith, 2023). At such low scales, the rate at which USB-type designs can acquire the necessary data or sample size for analyses is hindered. Furthermore, many laboratories are housed in universities and technically secure locations, and a wirelessly connected RPi is a security risk to an institutional network. The use of RPis in conjunction with Ethernet switches allows for orders of magnitude more subjects per experiment. Additionally, the use of a switch removes the need to be directly connected to the institutional network, with the result that the system can transmit significantly faster and is insensitive to technical issues of the institutional network, meaning one can connect to the RPis despite internet failures.

Cost of implementation
Commercial organizations, such as Med Associates, may provide state-of-the-art products, software, and functionality, but are expensive for an institution to implement. For instance, a single Med Associates MONO6 camera with a lens included costs ~$779 (~$46,740 for 60 cameras). Moreover, only eight cameras may be operated by a single computer, and only two computers can use a single software license, resulting in a cost of ~$103,500 to record 60 chambers. Furthermore, the cameras utilized in Mathis et al. (2018) and Nath et al. (2019), the Point Grey Firefly and Grasshopper3 4.1 MP Mono USB3 Vision (models FMVU-03MTM-CS and CMOSIS CMV4000-3E12), the Basler infrared-sensitive CMOS camera, the GoPro Hero5, and The Imaging Source DFK-37BUX287 industrial camera, range from ~$250 (GoPro Hero5) to ~$1539 (Grasshopper3) per camera (Basler infrared-sensitive, 2023; Blackfly, 2023; Hero5, 2023; The Imaging Source, 2023; Point Grey, 2023a, 2023b). Similarly, the Blackfly S Mono camera (model BFS-US-13Y3M-C) utilized in Pereira et al. (2022) is ~$480 per camera as listed by Teledyne FLIR.
The RPi is a frontrunner in addressing such issues of cost and use of laboratory space. For a 60-chamber system controlled by four computers, the cost of materials is approximately ~$11,000 (Table 1), less than the camera products alone from Med Associates, and a fraction of the cost of the systems utilized in Mathis et al. (2018), Nath et al. (2019), and Pereira et al. (2022). Considering that 23 RPis can be present on one Cisco Catalyst 3650 24-port switch, such an arrangement can be achieved across three computers at ~$10,500, while a single fully stocked AWOW and Catalyst 24-port switch with 23 RPis equates to roughly ~$4000.

Discussion
This report demonstrates the fidelity, customizability, stability, and highly scalable refinement of the affordable and high-throughput PiRATeMC (Pi-based Remote Acquisition Technology for Motion Capture) system, with the synchronization of 60 Raspberry Pi video recordings for subsequent pose-estimation analysis. Inclusion of the Arducam wide-angle fisheye lens provided better visibility of the full operant chamber and levers than the stock camera lens of the NoIR v2. Development of the BoxTop device allows for precise installation, leading to greater stability and visibility across chambers and sessions than previous methods, and enables standardization of camera location and distance to the chamber floor for cross-compatibility of developed models and reproducibility. Remote controllers and ethernet switches such as the AWOW mini-PC and Cisco Catalyst provide a localized network of instruments capable of initiating large-scale synchronized connections between devices, and the ease of use and low maintenance of the Raspberry Pi eliminates the need for specialists or extensive training to facilitate a laboratory's experimental needs.
The recording file recordVideo.sh is easily edited to fit the desired parameters of any laboratory, and multiple copies can be made to fit specific experimental designs that differ in length, color setting, file transfer location, bitrate, FPS, and resolution, to name a few. Issues with embedded call variables were overcome by hard-coding the password and video path. Successful installation and execution resulted in recording from 60 Raspberry Pis in parallel at considerably lower costs than commercial alternatives, with no sacrifice of video quality, as shown by the results of the SSIM, PSNR, VMAF, IFIV, and Jitter tests. Analyses of throughput, fidelity, and synchronicity, and comparisons of two of the most recent RPi models (3B+ and 4B), found that laboratories can accomplish virtually continuous data acquisition given the negligible processing time of recordings (Mdn = 116 ± 22 seconds), with quality assurance and synchronization within 0.02 seconds irrespective of RPi model. Successful completion of a 24 h session confirmed the utility of the RPi for long-form and extended timeframes in experimental designs investigating sleep or home-cage conditions that require lengths beyond a typical self-administration session.
The inclusion of the Arducam wide-angle fisheye lens provided better visibility of the full operant chamber and levers compared to the original camera lens. Experiments with large areas of operation, similar to that seen in Saxena et al. (2018), may benefit from wide-angle lenses by decreasing the number of cameras needed to capture the full experimental space, given the Arducam is an 8MP (4K) device, consequently lowering costs without losing resolution.
Development of the BoxTop adapter allows for precision in installation, leading to greater stability and visibility across chambers than methods in previous studies, particularly the use of hot glue in Singh et al. (2019) or Styrofoam holders outside the chambers in Weber and Fisher (2019, preprint, doi:10.1101/596106), as well as standardization of camera location and distance from the chamber floor. Similarly, given that Centanni and Smith (2023) report work with mice, and therefore smaller Med Associates chambers than the rat-sized chambers of this design, placement of the cameras 29 cm from the grid floor results in full visibility of the chamber controls, but video is obscured by the top plane of glass, as can be seen in the videos and images from their paper. That method would require regular maintenance to ensure clarity, while a lens such as the Arducam secured to the top pane of plexiglass captures the full chamber area. Moreover, it is acknowledged that the 3D-printed device by Hou and Glover (2022) is versatile and allows an extensive range of angle configurations and experimental designs. 3D printing was the original method of production for the BoxTop, but creating a single device took roughly 20 minutes, as opposed to no more than 1 hour to laser-cut 150 devices from acrylic, allowing far greater numbers in a shorter time. A readily apparent limitation of the BoxTop is that it only facilitates installation of cameras at the top of operant chambers, and not for other forms of experiments such as open field or T-maze, to name a few. But for operant experiments, and for subsequent development of pose-estimation models to be shared among different labs, where such details of camera configuration and location can affect performance, such a specific chamber relation facilitates cross-compatibility and standardization.
Remote controllers and ethernet switchboards such as the AWOW mini-PC and Cisco Catalyst featured in this system provide a headless network of instruments, localized to a focal point, capable of initiating large-scale synchronized recording between devices uncompromised by network issues. Mini-PCs are compact devices supplied in many hardware configurations to fit the needs and spatial requirements of any laboratory, while the Cisco Catalyst comes with varying numbers of ethernet ports and stackable capabilities, allowing orders of magnitude more RPis on a single remote controller, limited only by bandwidth.
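Such a headless network can be driven from the remote controller with a short shell loop. The sketch below is illustrative only: the hostnames (pi01–pi60), the user, and the script path are hypothetical, and with DRY_RUN=1 (the default here) it prints the commands instead of opening SSH sessions, so it can be inspected without the hardware.

```shell
#!/bin/bash
# Illustrative launcher for synchronized recordings across many RPis.
# Hostnames (pi01..pi60), the user, and the script path are assumptions.
# DRY_RUN=1 (default) prints each command instead of connecting via SSH.
DRY_RUN=${DRY_RUN:-1}
SCRIPT=/home/pi/recordVideo.sh

for i in $(seq -w 1 60); do
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "ssh pi@pi$i nohup bash $SCRIPT &"
  else
    # Backgrounding each ssh call lets all 60 start as close together
    # in time as possible.
    ssh "pi@pi$i" "nohup bash $SCRIPT >/dev/null 2>&1 &" &
  fi
done
wait
```

Backgrounding the SSH calls, rather than iterating serially, is what keeps start times tightly clustered across devices.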
The recording file recordVideo.sh is easily edited to fit the desired parameters of any laboratory, and multiple copies can be made to fit specific experimental designs such as those presented in this report. This customizability allows ease of execution and individuation of parameters, additional embedded code or variables, and video paths for storage, as well as inspiration for further development.
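As a concrete illustration of that customizability, a recordVideo.sh-style script might factor the session parameters into variables at the top. The raspivid flags below are standard, but the specific values, filename scheme, and remote destination are assumptions, not the authors' script; the command is echoed rather than executed so the sketch can be run on a machine without a camera.

```shell
#!/bin/bash
# Hypothetical sketch of a recordVideo.sh-style script: parameters the
# text mentions (length, FPS, resolution, bitrate, transfer destination)
# are variables that can be copied and edited per experimental design.
DURATION_MS=$((60 * 60 * 1000))   # 1 h session, raspivid takes milliseconds
FPS=30
WIDTH=800; HEIGHT=600
BITRATE=1000000                   # ~1 Mbit/s
OUTFILE="$(hostname)_$(date +%Y%m%d_%H%M%S).h264"
REMOTE="user@remote:/data/videos/"   # hard-coded path, as in the text

# Build the raspivid command and echo it (dry run) instead of executing.
CMD="raspivid -t $DURATION_MS -fps $FPS -w $WIDTH -h $HEIGHT -b $BITRATE -o $OUTFILE"
echo "$CMD"
echo "scp $OUTFILE $REMOTE"
```

Keeping each design's parameters in a separate copy of the script, as the text describes, means a session type is selected simply by choosing which file to execute.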
This system produces smaller video files at greater frame rates and comparable resolution relative to prior systems that report such statistics. The video output of Hou and Glover (2022) was 738.5MB per 29 minutes of video at 480p and 30FPS; Saxena (2019) reported 500MB (0.5 GB) per hour of video at 640x480 resolution and 30FPS; while Weber and Fisher (2019, preprint, doi:10.1101/596106) observed 440-500MB per hour during the light cycle and 1.5-2 GB per hour in the dark cycle at 1280x768 and 15FPS. Comparatively, the present design outputs only ~350MB per hour at 800x600 resolution and 30FPS.
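To compare these figures on a common footing, the reported sizes can be normalized to MB per hour, and the ~350MB/h figure converted to an implied average bitrate. A quick awk check, using only the numbers quoted above (and treating 1 MB as 10^6 bytes for simplicity):

```shell
# Implied average bitrate of the present system's ~350 MB/h output:
awk 'BEGIN { printf "%.2f Mbit/s\n", 350 * 8 / 3600 }'               # ~0.78 Mbit/s

# Hou and Glover (2022): 738.5 MB per 29 min, normalized to MB per hour:
awk 'BEGIN { printf "Hou & Glover: %.0f MB/h\n", 738.5 * 60 / 29 }'  # ~1528 MB/h
```

So the present design's hourly output is roughly a quarter of the per-hour figure implied by Hou and Glover's numbers, despite the higher resolution.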
Descriptive statistics found that any given RPi (N = 40) had a median conversion time of 55 ± 12.0 seconds and a median transfer time of 47.5 ± 10.6 seconds, for a total median processing time of 116 ± 22 seconds (~2 minutes), demonstrating the capability of executing multiple sessions in a given day with high video data output. If a lab were to run multiple 1-hour sessions in succession and processing time invariably fell at the high end of the range (562 seconds), one could theoretically complete ~20 hours of video data in a day.
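The ~20-hour figure follows directly from that worst case: each one-hour cycle costs 3600 s of recording plus 562 s of processing, and a 24 h day holds about 20 such cycles. A quick check:

```shell
# Worst-case cycle: 1 h of recording plus the slowest observed processing time.
CYCLE=$((3600 + 562))                        # 4162 s per 1 h session
SESSIONS=$((86400 / CYCLE))                  # whole sessions in a 24 h day
echo "$SESSIONS one-hour sessions per day"   # -> 20
```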
Tests of video fidelity for the main 40-RPi experiment confirmed that the Raspberry Pi is not only a cost-efficient means of video data acquisition but one that does not sacrifice quality, as shown by coefficients of variation of 0.19 % (SSIM), 1.11 % (VMAF), and 5.67 % (PSNR), with only one RPi having a single dropped frame. Full-reference metrics SSIM (0.99 ± 3e-4), VMAF (94 ± 0.2), and PSNR (40 ± 0.4) revealed high video quality of recordings, each well within the range of desired scores. Variance analyses confirmed synchronicity, as the system obtained a light-event frame standard deviation of 4.32 frames (M = 180 ± 1), or 144 ms (M = 6 ± 0.02 seconds). In conjunction with the IFIV (0.03 ± 2e-7) and Jitter (0.11 % ± 6e-4) scores, these measures display the accuracy of the RPi and its potential use in time-sensitive tasks and systems requiring temporal precision.
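The frame-to-time conversion behind those synchronicity numbers is straightforward: at 30 FPS each frame spans 1/30 s, so a spread of 4.32 frames corresponds to 4.32/30 s, i.e. 144 ms:

```shell
# Light-event spread: standard deviation in frames -> milliseconds at 30 FPS.
awk 'BEGIN { printf "%.0f ms\n", 4.32 / 30 * 1000 }'   # -> 144 ms
```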
Comparisons by session length revealed significant differences and large effects in conversion, transfer, and processing time, showing a scaling effect of session length, as processing takes 2.7 times longer for a 2 h session than a 1 h session. As reported, 24 h throughput data were not gathered due to the limited storage space of the 16 GB SD card; laboratories that wish to run experimental conditions with extended timeframes are therefore advised to purchase larger SD cards for their RPis, and it is safe to assume that conversion and transfer times will increase with session length. Fidelity comparisons revealed the 24 h session was significantly higher in PSNR scores than the 1 h and 2 h sessions, with no differences between the shorter sessions. Despite this difference, research shows that SSIM and VMAF metrics are more representative of the human visual system (HVS) and often outperform PSNR, which is also vulnerable to manipulation (Vranješ et al., 2013; Li et al., 2016; Deshpande, Ragha, and Sharma, 2018; Li et al., 2018; Sara et al., 2019). Similarly, IFIV and Jitter results found the 24 h session was significantly higher than the 1 h and 2 h sessions, with no differences between the shorter sessions, but this difference was on average 8e-7 (IFIV) or 0.003 % (Jitter), and longer sessions intuitively allow more opportunities for duplicated or lost frames during encoding.
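A rough calculation makes the 16 GB limit plausible. This is a sketch under stated assumptions, not an account of the authors' storage layout: at the observed ~350MB/h, a 24 h session yields ~8.4 GB of raw .h264, and if conversion to .mp4 temporarily holds a second copy alongside the original, peak usage roughly doubles:

```shell
# Rough storage estimate for a 24 h session at the observed ~350 MB/h.
RAW_MB=$((350 * 24))        # ~8400 MB of raw .h264
PEAK_MB=$((RAW_MB * 2))     # assumed second copy present during conversion
echo "raw: ${RAW_MB} MB, peak: ${PEAK_MB} MB"
# A nominal 16 GB card provides ~16000 MB before the OS is accounted for,
# so peak usage under this assumption already exceeds the card.
```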
Comparisons of the two latest Raspberry Pi models, 3B+ and 4B, found significant differences in conversion time and transfer time, but these differences offset one another, as no significant difference was found in processing time. Similarly, although PSNR scores differed by as much as ~7 dB, SSIM and VMAF scores between models were virtually identical. Furthermore, the significant differences found in IFIV and Jitter equate to roughly a tenth of a percent (0.1 %) of duplicate frames. These differences in fidelity, although significant, have negligible effects on a laboratory's pursuits, leaving the choice between the 3B+ and 4B models to personal preference, experimental goals, or market availability.
Lastly, a major advantage of the RPi is its affordability and versatility to suit the needs of any laboratory, as a number of configurations can be made with the removal or addition of other items. To place the RPi in perspective among recording device options, a single fully stocked Raspberry Pi, complete with RPi, camera, fisheye lens, 18" cable, PoE+ HAT, infrared light and adapter, micro-SD card, and Ethernet cable, costs ~$150, or ~$670 with the computer and switchboard included, considerably less than commercial alternatives. Therefore, the cost of implementation ranges from ~$670 (1 RPi) to ~$11,000 (60 RPis) as outlined in this report. Furthermore, the Raspberry Pi uses tools already present on the Linux/Ubuntu OS to conduct its functions, eliminating the need for new software purchases or installations, and an interface between the remote controller and switch can be established with a few lines of code in the terminal.
Some limitations should be noted. First, although only ~0.1 %, the reliable duplication of frames found by the IFIV and Jitter computations can confound experiments that require millisecond precision in conjunction with other methods such as optogenetics or voltammetry; however, the consistency of these duplications facilitates efforts to reduce or statistically account for their effects. Secondly, RPis have been in high demand in recent years, leading individuals to capitalize on that demand by purchasing products in bulk and reselling them; as a result, some providers such as Adafruit have been known to limit the number of items a single purchaser can acquire in one shipment, a restriction that can be overcome by distributing purchases across one's laboratory members. This also leads to variation in price across suppliers of the different individual products. Despite these obstacles, the price and availability compare favorably to the videography options presently on offer.
In summary, the implementation of the present system provides a laboratory with a cost-efficient and quality-assured means of acquiring video data for subsequent analyses, comparable to proprietary or recently developed Raspberry Pi methods. For labs conducting self-administration experiments, the throughput capabilities of the Raspberry Pi 3B+ or 4B are attractive, as session turnover is minuscule and storage needs are low (~350MB/h). Fidelity metric tests provided the results desired of a video acquisition system, as seen in the more robust SSIM and VMAF findings as opposed to the confoundable PSNR tests. Potential drawbacks for laboratories requiring extreme temporal precision, such as voltammetry, lie in the IFIV and Jitter results, but the reliability of these deviations allows a means of accounting for such discrepancies to be developed. The success of the 24 h session displayed the stability of the 4B model and the system, a benefit most relevant for laboratories wishing to conduct experiments analyzing sleep or home-cage activity as session times become longer.

Conclusion
Observing rodent behavior has been a fundamental method in neuroscience. Traditionally, this approach has faced challenges such as reliance on predefined constructs, low inter-rater reliability of these constructs, and experimenter fatigue and drift in documenting observations. The field of computer science, and in particular machine vision and pose-estimation, has shown itself able to overcome these limitations and discover patterns of behavior invisible to experimenters, leading to deeper insights into the etiology of behavioral manifestations and how these may translate to interventions and prevention in humans. These models of observation allow cross-compatibility within the programs themselves as well as between laboratories, leading to greater continuity of language in definitions. This raises a need for affordable, customizable, and scalable means of acquiring quality-assured video data.
In this report, the successful implementation and expansion of the PiRATeMC (Pi-based Remote Acquisition Technology for Motion Capture) system, with 60 synchronized Raspberry Pi video recordings for subsequent pose-estimation analysis, demonstrates a capable and effective option to address this rising demand. A laboratory can be fully installed, with no coding experience or expertise, in a month's time by a single technician executing a few lines of code as outlined in the tutorial on GitHub, providing the ability to record a large number of subjects across a variety of timeframes, species, and settings without the need for a specialist, resulting in high-throughput, high-fidelity video data with low storage needs and financial burdens. The low maintenance of the system provides further reassurance of meeting experimental needs and outcomes: routinely erasing old recordings, cleaning and focusing lenses, and protecting the exposed circuit boards of the cameras from environmental hazards in whichever way a laboratory sees fit. Limitations of the system, such as frame duplication, should be taken into consideration upon implementation. Ultimately, the Raspberry Pi's versatility, affordability, and ease of use make it an attractive option for various laboratory needs and designs at a competitive price compared to commercially available proprietary methods.

Declaration of Generative AI and AI-assisted technologies in the writing process
Statement: During the preparation of this work the author(s) used ChatGPT-4 in order to accelerate coding for FFmpeg analyses. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Declaration of Competing Interest
None

Fig. 1. Workflow. Visualization of data acquisition and statistical analyses of the five sessions compared by throughput, fidelity, and synchronicity.

Fig. 2. Raspberry Pi Assembly. a. Visualization of each individual Raspberry Pi product purchased, assembled, and utilized in this experiment. b. Unboxed Model 4B RPi with ports towards user. c. Fisheye lens (left) replacement for stock lens (right). d. Camera tabs to replace and install ribbon cable.

Fig. 4. BoxTop. Visualization of the BoxTop schematic, complete with measurements and display of placement over a 50 mm diameter operant chamber hole (large red center circle).

Fig. 5. Visualization of the BoxTop installed inside an operant chamber. a. Subject's eye-view from inside the modular chamber. b. Technician view from top of chamber.

Fig. 7. Visualization of a fully assembled Raspberry Pi and infrared light in an ENV-022SA Med Associates SAC. a. Raspberry Pi secured to inside wall by two M3 x 1" self-tapping screws and zip-ties. b. Inside view of infrared light secured to back wall of SAC by one M3 x 1" self-tapping screw pointing into chamber. c. Side view of infrared light secured to back wall of SAC.

Fig. 8. Recording Executable File. Example of the recordVideo.sh executable file to record and transfer videos to the remote.

Fig. 16. Light Event Identification. Line graphs showing Cam28 from session 2 h-15 Boxes and each frame's SSIM score to discover the light event by inverse luminance (Y) score. a. The full set of individual frames compared with reference frame 161 (scoring 1.0 when compared with itself) through frame 250. b. Scaled to the frames in which the cue lights were on, displaying the frame at the brightest point (181) at the lowest Y value, and the increase in Y scores as the camera adjusts to the cue lights before frame 241, where the cue lights turned off.

Table 1
All products utilized including vendor, individual price, price within a 60 RPi design, and usage or variation notes, and a total cost.
Example usage note: Secures Raspberry Pi to inside wall of the Med Associates sound-attenuating chamber (SAC); 120 total.

Table 2
Table of variables used for statistical analyses.
light_frame | Light Frame | Frame number determined by FFmpeg as the lowest inverse SSIM luminance (Y) value
light_secs | Light Frame in Seconds | Light frame number divided by 30 FPS

Table 3
Summary of Statistical Comparisons by Time and Raspberry Pi Models.
Statistics reported are medians and standard error of the median, while italicized statistics are means and SEM. Significance: * p < .05, ** p < .01, *** p < .001.