Conducting Molecular Epidemiological Research in the Age of HIPAA: A Multi-Institutional Case-Control Study of Breast Cancer in African-American and European-American Women

Breast cancer in African-American (AA) women occurs at an earlier age than in European-American (EA) women and is more likely to have aggressive features associated with poorer prognosis, such as high-grade and negative estrogen receptor (ER) status. The mechanisms underlying these differences are unknown. To address this, we conducted a case-control study to evaluate risk factors for high-grade ER- disease in both AA and EA women. With the onset of the Health Insurance Portability and Accountability Act of 1996, creative measures were needed to adapt case ascertainment and contact procedures to this new environment of patient privacy. In this paper, we report on our approach to establishing a multicenter study of breast cancer in New York and New Jersey, provide preliminary distributions of demographic and pathologic characteristics among case and control participants by race, and contrast participation rates by approaches to case ascertainment, with discussion of strengths and weaknesses.


Rationale for the Study.
Although breast cancer incidence is higher overall in women of European descent than in women of African ancestry, African-American (AA) women are more likely than European-American (EA) women to be diagnosed before age 40 and to have breast tumors with more aggressive features, including high-grade and negative estrogen receptor (ER) status (reviewed in [1]). There are no facile explanations for these differences in the epidemiology of breast cancer by ancestry. There have been several studies of breast cancer risk that include both AA and EA women, such as the Carolina Breast Cancer Study, the CARE Study, and the Black Women's Health Study; however, none were specifically designed and powered to evaluate numerous risk factors for early/aggressive breast cancer and to evaluate the distribution of these risk factors within and across racial/ethnic groups. Because of the large, racially mixed population of women in metropolitan New York City (NYC) and eastern New Jersey (NJ), we are currently conducting a case-control study, the Women's Circle of Health Study (WCHS), with the goal of accruing 1200 AA and 1200 EA women with breast cancer and an equal number of controls, to specifically address these questions. Initial funding for this study was through a Center of Excellence for Biobehavioral Breast Cancer Research (Bovbjerg, PI) focusing on AA women, funded by the Department of Defense (DOD). Additional R01 funding (Ambrosone, PI) from the National Cancer Institute (NCI) was subsequently obtained which allowed us to increase the sample size and to extend the study to EA women. Additional facets of the study are funded by the Breast Cancer Research Foundation.

Materials and Methods
As illustrated in Figure 1, the study has included two bases for recruitment and interviewing, one in NYC, based at Mount Sinai School of Medicine (MSSM), and one in NJ, based at The Cancer Institute of New Jersey (CINJ), with data and biospecimens sent to Roswell Park Cancer Institute (RPCI) in Buffalo, NY, for processing and storage. In the NYC metropolitan region, there are more than 60 hospitals where surgery for breast cancer is performed. When this study began in 2003, to maximize efficiency, we targeted the hospitals that had the greatest referral patterns for AA women in the boroughs of Manhattan, Brooklyn, Queens, and the Bronx. Our initial plan was to employ the approach commonly used in case-control studies, such as the Carolina Breast Cancer Study [2] and the Long Island Breast Cancer Study Project [3], wherein rapid case ascertainment is used to identify women newly diagnosed with breast cancer through periodic review of pathology reports in the targeted hospitals. When women with breast cancer are identified, a letter is sent to the treating physician, notifying them that unless they object, the patient will be contacted to describe the study and assess interest in participation.
We were unable to use this approach, however, due to the implementation of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule in 2003, while we were establishing the infrastructure for the study. This extension of the HIPAA regulation prevents the release of protected health information (PHI) without consent from the patient. For our research purposes, the Privacy Rule prevented the identification of eligible cases without the patients' prior permission given to their doctors. Although there may be situations in which a HIPAA waiver can be obtained to remove the need for patient permission before identifying information is released to researchers [4,5], the participating hospitals and their Institutional Review Boards (IRBs), many not extensively familiar with epidemiological research, would not grant these waivers to allow patient identification. Thus, we developed a procedure for patient ascertainment and contact that complied with the regulations of HIPAA.
As an alternative strategy, we expanded our catchment area to include eastern NJ, by partnering with CINJ and the NJ State Cancer Registry, a Surveillance, Epidemiology and End Results Program (SEER) site, housed at the NJ State Department of Health and Senior Services (NJDHSS). The study has been approved by the IRBs at RPCI, Robert Wood Johnson Medical School (for CINJ), MSSM, the individual hospitals in NYC, and the NJDHSS.
In this paper we report on both of our approaches to case ascertainment and consenting, discussing effort and costs associated with each methodology. Currently, recruitment efforts are focused only in NJ, and accrual has been discontinued in NY. We also present an overview of the study design, report on distributions of demographic and selected breast cancer risk factors among both cases and controls by race/ethnicity, and compare clinical breast cancer characteristics between groups in a subset of the population enrolled to date.

Hospital-Based Case Ascertainment and Contact: New York City.
AA and EA women, 20 to 65 years of age, who speak English, had no previous history of cancer other than nonmelanoma skin cancer, and had been diagnosed within the previous 9 months with primary, histologically confirmed invasive breast cancer or ductal carcinoma in situ were eligible for participation in the study. They were ascertained from designated hospitals that have large referral patterns for AA women in the NYC boroughs (Manhattan, Bronx, Brooklyn, and Queens; due to few AA breast cancer patients, Staten Island was not included). To maintain comparability between cases and controls, women with breast cancer were required to have a residential telephone, given that controls were ascertained using random digit dialing (RDD). This eligibility criterion has since been expanded to include cell phone users, with RDD also covering cell phones for control ascertainment.
To address HIPAA regulations that prohibit identification of women with breast cancer using pathology reports, tumor registry data, or medical records, we worked to develop collaborative relationships with physicians, research nurses, and patient navigators at each of the participating hospitals. Our research assistants (RAs) initiated frequent visits to each site, particularly on clinic days, and became well known by staff and clinic personnel. As we began working with physicians at each site, clinicians reviewed their records for retrospective ascertainment and identified women who were eligible to be in the study (i.e., had been diagnosed within the last 9 months). At each of the participating hospitals, physicians telephoned women who were not returning for follow-up and would not be seen at subsequent visits, asking if WCHS staff could contact them regarding the study. Those scheduled for routine follow-up appointments within the 9-month interval were seen and asked if they were willing to be contacted for this study. For contemporaneous recruitment, our study staff were present in the offices on breast clinic days and were informed by the physicians or research nurses of patients scheduled on those days who were eligible for the study. Study materials were placed in the charts of the eligible patients as a reminder for the clinician to discuss the study. If the patient agreed, she was then referred to our waiting study staff. A number of patients participated in the informed consent procedures at the time that they were first approached, and a pretreatment blood specimen was obtained. Other women preferred to be contacted at a later date by the RA/Study Interviewer to schedule a date to obtain consent and conduct the in-person interview.
To strive for complete case ascertainment, we periodically requested that physicians review their records to confirm that we had not missed potential cases, and that they follow the procedures described above if there were women who were not previously approached to participate in the study.
It was our intent that this periodic review would allow us to estimate a denominator, to some extent, and to keep track of women who refused to be contacted so that selection bias could be examined. However, these data were not easily obtained with our inability to access records of women diagnosed who had not been approached, and competing priorities of busy surgeons.
This approach to case ascertainment and contact yielded good participation rates for both AA and EA cases but was extremely labor intensive, requiring frequent communications between our research staff and clinical personnel as well as the presence of RAs at the hospitals on clinic days. Besides being costly in personnel time, this methodology required a good deal of dedication and commitment on the part of physicians, with frequent reminders from study staff for them to check their appointment ledgers and contact patients who may have been missed on clinic days. Because of these limitations, in 2006 we established a collaboration with the New Jersey State Cancer Registry, based at the NJDHSS, for rapid case ascertainment, and phased out recruitment in metropolitan New York, ending in December 2008.

Population-Based Case Ascertainment and Contact: New Jersey.
In NJ, cases are actively being identified at all major hospitals in Passaic, Bergen, Hudson, Essex, Union, Middlesex, and Mercer Counties through rapid case ascertainment. In addition, NJDHSS study staff routinely check the New Jersey State Cancer Registry (NJSCR) database for eligible cases who reside in the target counties but are reported by hospitals outside of those seven counties or out-of-state. All AA women less than 65 years of age who are newly diagnosed with incident breast cancer are identified as potential participants. For each AA case, an EA woman with breast cancer is randomly selected, matching on age (±5 years) and county of residence. NJDHSS study staff review pathology reports of potential cases, contact doctors' offices and hospitals to verify patients' race and demographics, and check the NJSCR database for prior diagnoses of cancer. After contact with clinicians by NJDHSS staff for passive consent (i.e., the physician responds only if he or she does not give permission to contact a patient), eligible women are telephoned by NJDHSS staff to obtain verbal consent to release names and contact information to WCHS research staff at CINJ. Patients who agree to be contacted by WCHS study staff are then telephoned by one of our interviewers, and appointments are scheduled for in-person interviews at home or at another mutually convenient location.
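The matching rule described above (one randomly selected EA case per AA case, within ±5 years of age and in the same county of residence) can be illustrated with a minimal sketch. The record layout, field names, identifiers, and the function `select_matched_ea_case` are hypothetical and are not taken from the registry's actual procedures; the sketch also does not track whether an EA case has already been used as a match.

```python
import random

def select_matched_ea_case(aa_case, ea_pool, rng, age_window=5):
    """Randomly pick one EA case in the same county and within
    +/- age_window years of the AA case; return None if no candidate."""
    candidates = [
        ea for ea in ea_pool
        if ea["county"] == aa_case["county"]
        and abs(ea["age"] - aa_case["age"]) <= age_window
    ]
    if not candidates:
        return None
    return rng.choice(candidates)

# Hypothetical records, for illustration only.
aa_case = {"id": "AA-001", "age": 47, "county": "Essex"}
ea_pool = [
    {"id": "EA-101", "age": 44, "county": "Essex"},   # eligible match
    {"id": "EA-102", "age": 58, "county": "Essex"},   # age difference too large
    {"id": "EA-103", "age": 49, "county": "Union"},   # different county
]

rng = random.Random(0)  # fixed seed so the illustration is reproducible
match = select_matched_ea_case(aa_case, ea_pool, rng)
print(match["id"])
```

In this toy pool only one EA record satisfies both criteria, so the random draw is trivial; with a realistic pool the seeded generator makes the selection auditable.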

Control Eligibility and Identification: New York City and New Jersey.
AA and EA women 20 to 65 years of age without a history of any cancer diagnosis other than nonmelanoma skin cancer are eligible to be controls. The choice of a proper control group is a difficult issue in epidemiology today, particularly for a study that is not population-based. When planning for the WCHS, we evaluated several potential sources of control groups, weighing the strengths and weaknesses of each. While we considered using hospital controls in NYC, we felt that they would not necessarily represent the same populations from which the cases were derived. For example, many of the treating physicians at MSSM have private surgical practices; there is no indication that clinic patients from the hospital would be similar to those being treated by private physicians. Furthermore, there are well-recognized potential biases associated with the use of hospital controls [6]. In theory, the generalizability of study results is likely to be greater in studies using community controls rather than those using friend or hospital controls. Yet, in contrast to the Western European national health care records, none of the available United States (US) lists, such as those of licensed drivers, municipal tax rolls, voter registration, and listed phone numbers, provide complete source population enumeration. Population coverage, access to this information, and the quality of contact information vary geographically in the US. Of NYC residents, it is estimated that only 52.1% have driver's licenses [7], only 30.2% pay residential taxes [8], and only 56.2% are registered voters [9]. These examples typify the acknowledged weaknesses of US and NYC sampling frames.
For generating a control group of adults under 65 years of age we used random digit dialing (RDD) because unlisted numbers can be reached by this method, thereby avoiding possible selection bias (an NYC study found that 27% of RDD controls had unlisted numbers [10]). RDD provides an ideal source when phone coverage is nearly complete; 93% of NYC residences have phones [11]. High phone coverage makes RDD one of the best sources for generating a sampling frame for controls of NYC area women under 65 years of age. Even when the source population is not solely defined by geography, a modified version of RDD is available that creates a control sampling frame using the cases' telephone numbers [10,12]. This is the approach that was used in the WCHS in NY. RDD controls have been compared to a privately conducted census population [13] as well as to area survey controls [14], and both comparisons found that RDD controls were similar to those from other sources. Most importantly, high response rates within a minority community were demonstrated using the modified Waksberg RDD method [15], and in the WCHS, response rates among minorities are similar to those among EA women. The elimination of household landline phones in favor of cell phones represents a challenge for telephone surveys based on RDD to landline telephones [16,17]. However, because the percentage of households without landlines remains low [17], any potential bias associated with this issue is likely to be small. Furthermore, once subjects agree to participate in the study, cell phones tend to facilitate scheduling interviews and completing study materials because the calls go directly to the participants and are not screened by other household members.
For RDD in NYC, the telephone exchanges (area code plus three-digit prefixes) of the breast cancer cases who received medical care at the participating hospitals in previous years were used for sampling. We frequency matched controls to cases on the expected breast cancer case distribution (based on 1994-1998 data from the NYS Tumor Registry) by 5-year age groups and race. The age distribution of targeted controls was periodically modified based upon the actual distributions of age among the cases. Controls were identified, recruited, and interviewed in the same manner and during the same time period as the cases to eliminate any bias related to secular trends or changes over the interviewing period.
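Frequency matching of this kind amounts to setting a target number of controls for each race by 5-year age-group stratum in proportion to the case distribution. The following is a minimal sketch of that bookkeeping, assuming a 1:1 control-to-case ratio and a simple list of case records; the function names, field names, and toy data are hypothetical. In the study the initial targets came from registry data rather than from enrolled cases, so the input here stands in for whichever distribution is being matched.

```python
from collections import Counter

def age_group(age, width=5):
    """Label the 5-year age group containing `age`, e.g. 42 -> '40-44'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def control_targets(cases, ratio=1.0):
    """Target number of controls per (race, age group) stratum,
    proportional to the number of cases in that stratum."""
    counts = Counter((c["race"], age_group(c["age"])) for c in cases)
    return {stratum: round(n * ratio) for stratum, n in counts.items()}

# Hypothetical case records, for illustration only.
cases = [
    {"race": "AA", "age": 42},
    {"race": "AA", "age": 44},
    {"race": "AA", "age": 51},
    {"race": "EA", "age": 58},
]

targets = control_targets(cases)
for stratum, n in sorted(targets.items()):
    print(stratum, n)
```

Recomputing the targets as cases accrue corresponds to the periodic adjustment of the control age distribution described above.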
In NJ, the same methodology is used for ascertainment of eligible controls; however, rather than using telephone numbers from participating hospitals, the entire county is sampled, because cases include those from all hospitals in the seven targeted counties. Controls, once identified, are contacted to schedule an in-person interview; interviews are conducted either at the participant's home or at another convenient location.
For both cases and controls in NYC and NJ who decline participation, we request that they complete a short telephone interview (5-10 minutes) to obtain basic information on demographic and exposure factors. In the final analysis, data from women who refused study participation will be compared to data from women who completed an interview to evaluate potential bias related to non-participation. Women who complete the study are offered a $50 gift certificate to one of several local stores as incentive for participation. We had initially offered $25 at the beginning of the study, but later increased the amount due to inflation and efforts to increase participation.

Data Collection-Interviews and Specimen Collection.
The in-person interview consists of the informed consent process, an in-depth in-person interview, completion of several behavioral questionnaires including a Food Frequency Questionnaire (FFQ), collection of biospecimens, and body measurements. For cases, we also request a release for access to medical records, pathology data and for tumor tissue, as well as permission to conduct followup.
The survey instrument is an adaptation of several questionnaires, including validated surveys from the Women's Health Initiative and the Western New York Diet Study. Developmental history questions were taken from the Women's Interview Study of Health (WISH) [18], and lifetime physical activity is assessed using a modified version of Friedenreich's validated questionnaire [19]. Information on medical history, family history of cancer, and lifestyle factors including smoking, alcohol consumption, and use of hair products is also collected. The most recent version of the FFQ developed at Fred Hutchinson Cancer Center and validated in the NCI/SWOG Prostate Cancer Prevention Trial is used for dietary assessment. This FFQ has been validated for use in an AA population. At the end of the visit, detailed measurements of current body size are taken. Participants are asked to wear light clothing, and weight, standing height, and waist and hip circumferences are measured. Body composition (lean and fat mass) is measured using a bioelectrical impedance analysis scale (Tanita scale). Questionnaires are coded by two separate RAs, and double data entry is performed by two separate clerks, with data managed at RPCI.
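Double data entry is typically verified by comparing the two independently keyed files field by field and sending any disagreement back to the paper questionnaire. A minimal sketch of such a comparison is shown here; the dictionary layout, participant identifiers, and field names are hypothetical and do not reflect the study's actual data management system.

```python
def find_discrepancies(entry_a, entry_b):
    """Compare two independently keyed datasets.

    Each dataset maps participant ID -> {field: value}. Returns
    {participant_id: {field: (value_a, value_b)}} for every mismatch,
    including records or fields present in only one dataset.
    """
    discrepancies = {}
    for pid in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(pid, {})
        rec_b = entry_b.get(pid, {})
        diffs = {
            field: (rec_a.get(field), rec_b.get(field))
            for field in sorted(set(rec_a) | set(rec_b))
            if rec_a.get(field) != rec_b.get(field)
        }
        if diffs:
            discrepancies[pid] = diffs
    return discrepancies

# Hypothetical keyed records from two data entry clerks.
first_pass = {
    "P001": {"age_menarche": 12, "parity": 2},
    "P002": {"age_menarche": 13, "parity": 0},
}
second_pass = {
    "P001": {"age_menarche": 12, "parity": 3},  # keying disagreement
    "P002": {"age_menarche": 13, "parity": 0},
}

flagged = find_discrepancies(first_pass, second_pass)
print(flagged)
```

Only records that pass this check without discrepancies would count as verified in the sense used for the analytic subset.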
Interviews take approximately 2 hours to complete, including anthropometry measures. We initially collected blood samples which were processed and stored in the laboratory at MSSM. In 2007, to reduce costs and to facilitate participation, we transitioned to collection of saliva using Oragene Kits (DNA Genotek, Inc, Ottawa, ON, Canada) for DNA extraction. These collection kits yield large quantities of high-quality DNA, comparable to that obtained from whole blood [20,21].
Periodically, DNA has been extracted in batches, using the DNA Genotek Inc. protocol for DNA extraction from saliva or the FlexiGene method (Qiagen Inc, Valencia, CA) for whole blood or buffy coat. DNA is evaluated for purity and concentration using a Nanodrop UV spectrophotometer to obtain A230, A260, and A280 readings, and double-stranded DNA is quantitated using a PicoGreen-based fluorometric assay (Molecular Probes, Invitrogen Inc, Carlsbad, CA). Saliva specimens have been stored at room temperature until extraction, and DNA samples are stored at −80°C at RPCI.
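The three absorbance readings are conventionally summarized as two purity ratios, A260/A280 and A260/A230. The sketch below computes them for a single sample; the function name and the example readings are hypothetical. As a commonly used rule of thumb (not a criterion stated for this study), an A260/A280 ratio near 1.8 is taken to indicate DNA with little protein contamination.

```python
def purity_ratios(a230, a260, a280):
    """Return the two standard absorbance ratios used to judge DNA purity."""
    if a230 <= 0 or a280 <= 0:
        raise ValueError("A230 and A280 readings must be positive")
    return {
        "A260/A280": a260 / a280,  # low values suggest protein contamination
        "A260/A230": a260 / a230,  # low values suggest salt or solvent carryover
    }

# Hypothetical readings for one sample.
ratios = purity_ratios(a230=0.45, a260=0.90, a280=0.50)
print({name: round(value, 2) for name, value in ratios.items()})
```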

Collection of Tumor Tissue Blocks and Clinical Data.
Formalin-fixed paraffin-embedded blocks and corresponding pathology reports from patients who signed the pathology and tissue release have been retrieved from hospitals on an ongoing basis. To date, 1193 patients have agreed for release of their tumor tissue (91%), and this proportion does not vary between NJ and NY. Pathology reports are reviewed in order to identify a representative tumor block used to make the primary breast cancer diagnosis for each case. The tumor blocks are shipped to RPCI, where they are labeled and entered into the tracking database. Hematoxylin and eosin (H&E) slides are cut and reviewed by the study pathologist (HH) to determine the locations from which cores should be taken for construction of tissue microarrays (TMAs), taking punches from both tumor and normal tissues and for consistent determination of grade by one pathologist. Representative tumor tissue is also labeled and punches taken to be stored for future DNA extraction and analysis. Pathology departments that do not release blocks have instead been asked to process and cut the requested number of slides (eleven unstained 5 µ slides and six unstained 10 µ slides), which are then sent to the laboratory at RPCI. Tissue blocks and pathology reports are collected in tandem and include the abstraction of medical record data. Because the consent process includes a tissue block and medical record release form, and blocks are being requested in "real time", there has been little resistance on the part of the hospitals to provide tissue.

Challenges and Adaptations to Meet Them.
In establishing the infrastructure for this study, and making efforts to conduct a study based in community hospitals in the face of stringent HIPAA and confidentiality requirements, our group brainstormed and adapted to achieve maximum case ascertainment, contact of patients, and recruitment into the study. With the help of committed and dedicated clinicians, this approach was successful at some hospitals, but not all. Clearly, it places a burden on already busy clinical practices, and it is likely that a complete denominator was not available, due to patients overlooked or deemed not suitable for participation in the study by their physician. In our experience, this is not a practical way to conduct a study and, unless one can ascertain cases through pathology reports or medical records, the costs of such efforts through local hospitals may not justify the numbers of cases able to be accrued. In contrast, by working through the NJDHSS, an NCI SEER site, we capture all cases diagnosed within a circumscribed area and truly know the denominator of the study for calculation of response rates. An additional advantage is that information on tumor characteristics is available for non-participating cases.
The trade-off is in participation rates. In NYC, when women were personally apprised of the study by their physician, response rates were relatively high, with 75% of EA and 75% of AA women completing interviews and providing blood or saliva samples. However, we have no data on the number of women who were eligible for the study and were not approached by their physician, or those who requested not to be contacted by our study staff.
When contacted by the NJDHSS, response rates are lower but still remain satisfactory. For EA women, 73% agreed to be contacted by an interviewer, and 93% of those women were interviewed and provided a saliva sample, for a total participation rate of 68%. Participation was poorer for AA women in NJ; 60% agreed to be contacted by an interviewer when telephoned by staff from the NJDHSS, and of those, 90% were enrolled into the study, for a total participation rate of 54%. We have met approximately half of our accrual goal, to date, and efforts are constantly made to improve response rates.
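The total participation rates quoted above are the product of the two stages: the proportion agreeing to be contacted and, among those, the proportion enrolled. A short check of that arithmetic, using only the percentages reported in the text (the function name is illustrative):

```python
def overall_participation(agreed_to_contact, enrolled_given_contact):
    """Overall participation = P(agreed to contact) * P(enrolled | agreed)."""
    return agreed_to_contact * enrolled_given_contact

ea_rate = overall_participation(0.73, 0.93)  # EA women in NJ
aa_rate = overall_participation(0.60, 0.90)  # AA women in NJ

print(f"EA: {ea_rate:.0%}, AA: {aa_rate:.0%}")
```

Because the two stages multiply, a modest shortfall at the registry contact stage carries through to the overall rate even when enrollment among contacted women is high.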
In NJ, the study is truly population-based. Newly diagnosed patients from all hospitals in the 7 targeted counties are ascertained and contacted by the NJDHSS. These counties provide the population to be captured by RDD as well. In NY, we focused on those hospitals with the highest referral patterns for AAs in four of the five boroughs (excluding Staten Island), and it is clear that coverage was not complete. While an average of 1273 cases per year are reported in AA women in the boroughs, we were only able to ascertain approximately 67 per year through working with clinicians in selected hospitals. We expect that the control sampling frame in NY results in a representative population, nonetheless, because the telephone exchanges (area code plus three-digit prefix) of breast cancer patients seen in previous years at each hospital were used to reach women in the same residential areas.
When confronted with difficulties in case ascertainment in NYC, we sought ways to expand eligibility criteria without compromising the integrity of the study. We initially limited eligibility for case participants to those between the ages of 20 and 64 years, primarily because of the low response rates using RDD for controls 65 years and older. In 2007, we extended the upper limit of age eligibility to 75 years for cases, but not controls. Although these older women cannot be used in case-control comparisons, they will allow for case-case analysis of younger versus older age at onset of breast cancer, in which age of the patient is the dependent variable. This will allow us to explore possible differences in study variables (e.g., aggressive versus nonaggressive disease characteristics) between older and younger breast cancer patients. We will also explore the possibility that such differences vary by race/ethnicity and by other disease characteristics defined by pathology.
We had initially trained WCHS interviewers in phlebotomy and made consent for specimen collection a requirement of the study. Three tubes of blood were collected and processed, with straws stored with plasma, serum, red blood cells (RBC), and buffy coat for DNA extraction. Our intent was, when possible, to collect pretreatment blood samples to be able to compare biomarkers in cases and controls and for use later in studies of breast cancer prognosis. Because of the difficulties in accrual in NYC, and in planning approaches in NJ where we knew that we would not be able to coordinate specimen collection prior to initiation of cancer therapy, we decided to collect saliva as a source of DNA only, using Oragene Saliva DNA Self-Collection Kits when we began recruitment in NJ. Again, our ideal approach would be to have pretreatment blood specimens on all cases, but in the interests of cost and feasibility and what was viewed as long term utility of samples other than DNA, compromises had to be made. To date, we have serum, plasma, and RBCs banked on 261 AA and 197 EA controls as well as 198 and 147 AA and EA cases, respectively, which should provide us with capabilities to investigate, in a limited sample set, differences in biomarkers among controls only, and case control evaluations for markers that are not likely to be affected by surgery or adjuvant therapy. All other cases and controls provided saliva samples, and there are no participants in the study for whom a source of DNA is not available.

Results
As noted above, case ascertainment and accrual in NYC was terminated in 2008, and all efforts are now focused on enrollment in NJ. Table 1 shows current recruitment numbers for cases and controls, by race, in NYC and in NJ. For the scope of this paper, we are reporting data on the subset of cases and controls whose questionnaire data have been processed and verified through double data entry, which includes 858 controls and 1119 cases. In examining preliminary data through February 2009, there are notable differences by race/ethnicity among participants. Because we are still in the data collection phase, we have made limited comparisons between cases and controls in this report. Rather, we have contrasted demographic and tumor characteristics among AA and EA women in our study samples. Among controls (Table 2), there are differences in country of birth, with more AAs born in the Caribbean. EAs are more likely to be married, to have graduated college, and to have employer-provided health insurance. Higher proportions of EA women have incomes above $90,000 per year, and EA women have fewer pregnancies, and at a later age, than AAs. Rates of screening mammography are similar between AA and EA women without breast cancer (86% and 87%, respectively). Notably, AA controls are more likely than EAs to be overweight (30% versus 25%) or obese (52% versus 26%) but are less likely to use hormone replacement therapy (HRT) (15% versus 24%).
Demographic characteristics of cases (Table 3) and differences by race/ancestry are, for the most part, similar to distributions for controls in terms of birthplace, marital status, education, health insurance, and income. Twenty percent of AA women with breast cancer in our study either do not have health insurance (17%) or pay for insurance out of pocket (3%), compared to 12% of EA cases (4% with no insurance, 8% self-purchased). In contrast to controls, where use of mammography is similar by race/ancestry, only 78% of AA cases ever had a screening mammogram, compared to 88% of EA cases, and 51% of EA cases had their breast cancer discovered by mammography versus only 36% of AA cases. There also appear to be greater differences by race/ancestry for hormonal and reproductive factors among cases than among controls. Twenty-nine percent of AA cases experienced menarche at or below age 12, compared to only 24% of EA cases; these differences are not as notable among controls (27% versus 25%). AA cases also tend to have more children, and at an earlier age, than EA cases, similar to patterns observed among controls. As observed for controls, AA women with breast cancer are also more likely to be overweight (31%) or obese (53%) than EA cases (26% and 26%, respectively) and are less likely to use HRT (15% versus 27%).
Among the pathology reports abstracted to date, the characteristics of tumors of women in our study are similar to those noted in the literature [1]. AA women are more likely than EA women to have high-grade tumors (52% versus 32%) with ER-negative (34% versus 22%) and PR-negative (48% versus 34%) status. There are negligible differences by ancestry for HER2 status in our study population.
It is possible that differing methods of ascertainment and accrual could result in selection bias. We compared clinical and some epidemiological data between participants in NY and those in NJ. As shown in Table 4, AA cases from NY are more likely to have less than an 11th grade education (22% versus 9%), to lack health insurance (23% versus 9%), or to be receiving Medicaid (21% versus 8%). Cases in NY had a lower proportion of DCIS (13% versus 21%), with the proportion of invasive cancers being slightly higher (87% versus 79%). These differences may be due to the fact that, in New York, the majority of AA cases were ascertained at Kings County Hospital in Brooklyn, which serves a large Caribbean community, many with low socioeconomic status, or because participation rates were higher in NY, resulting in some selection bias among those who agreed to be contacted in NJ.
For EA patients (Table 5), NY cases were more likely to be postgraduates (36% versus 22%) but were also more likely to be uninsured (5% versus 2%) or to receive Medicaid (4% versus 0%). Cases in NY were less likely to be obese (32% versus 22%) and had an older age at menarche (52% versus 42%). Differences between controls in NY and NJ (Tables 6 and 7) showed patterns similar to those for cases. NY AA controls were more likely to be on Medicaid (18% versus 10%) and were more likely to be obese (55% versus 34%). Similar differences were noted for EA controls.
It is difficult to ascertain the representativeness of our participants in relation to the underlying populations from which they were derived. However, we did ask those who refused to be interviewed to complete a short telephone interview. In NY, cases who refused tended to be older (>49 years), to be insured (through Medicaid, Medicare, or employer-based insurance), to have never taken hormone replacement therapy, and to have had screening mammograms. Similar differences were noted for cases in NJ and for controls (insured, no HRT, and higher prevalence of screening mammograms). For controls, those who refused were more likely to have employer-provided insurance. The higher participation rates of cases in NY suggest that there would be less selection bias than in NJ, particularly for AA cases, because of lower participation rates in NJ. On the other hand, the population of cases in NY is somewhat skewed towards those treated at Kings County Hospital, where there is a large Caribbean population.

Discussion and Future Directions
When embarking on the conduct of a case-control study, a number of factors should be considered with respect to methodology. Uppermost in importance is feasibility, which is often overlooked by young, eager investigators. Although we recruited and interviewed over 500 cases through hospitals in NYC, the approach was often a struggle, and there is no question that case ascertainment through collaboration with a state SEER Cancer Registry is much more efficient.
Using this approach, we are currently interviewing over 60 women per month, with numbers expected to rise with additional interviewers hired. We are confident that we will reach our accrual goals within the next 24 to 36 months, with ample power to evaluate our main study hypotheses, yielding important information regarding the etiology of aggressive breast cancers among AA as well as EA women. Since initiating the study, scientific knowledge has advanced, and while our earlier aims were to categorize women according to age at onset, tumor grade, and ER status, we are currently reclassifying tumor grade based on readings from one pathologist and building TMAs with funding from the Breast Cancer Research Foundation to stain and read all tissue for ER, PR, and HER2 for assessment of triple negative breast cancers as well as cytokeratins 5 and 6 and HER1 to help classify basal-like breast cancers. The successful enrollment of cases and controls, and collection of tissue blocks, has also facilitated numerous collaborations for pooled studies to conduct genomewide association studies and to determine the extent of African admixture in relation to tumor characteristics. With tumor tissue DNA as well as TMAs in addition to the epidemiologic data and biospecimens, we will have numerous opportunities not only to address our primary hypotheses but also to address novel hypotheses regarding ethnic/racial disparities in breast cancer incidence and mortality.

Conclusion
Epidemiological research has become increasingly difficult with the growing concerns regarding privacy and legal issues.
To address pressing issues in breast cancer research, particularly causal factors for the more aggressive breast cancers in AA women, creative strategies are required to conduct hospital- and population-based studies. Partnership with a SEER site is one approach for successful and complete case ascertainment and can facilitate the needed research in breast cancer disparities.