Reason #1: Because it can help Humanity.
Millions of people suffer each year from psychiatric and neurological diseases. Often, these diseases are diagnosed properly only after an acute event happens, like a hospitalization for psychosis. This leads to a great deal of suffering for patients and families, as well as inefficiencies in the medical system.
By contributing your voice sample you can help build a panel of diagnostics that can be used to detect these diseases earlier, before full-blown symptoms appear. This can, as a result, reduce the time to diagnose patients from years to minutes and costs from thousands of dollars to tens of dollars.
We believe that contributing your voice sample is much like contributing a blood sample as a part of a blood drive, except that you will be contributing to a rich set of research purposes that can scale much more than a blood sample. With your voice donation you can affect millions, if not hundreds of millions, of people later on.
Reason #2: You'll get a cool health report.
At the end of the study, we will send you a report of how your data compares to the normative population. This will include things like relative health traits relating to various diseases and/or general statistics about height/weight measurements. Companies like 23andMe charge for these reports, but we'll make these available to you free of charge as an incentive to stick through the study to the very end.
Reason #3: You'll have the potential to win $5,000 cash.
If you enroll in the study and provide multiple voice samples over a 1-month period, you will have a chance to win $5,000 cash. The more engaged you are, the more likely you are to win the prize. For example, if you submit 4 voice samples over a month, you will have 4 times the chance of winning the $5,000 cash prize at the end of the program that you would have with a single sample. In this way, we hope to incentivize you to complete the study.
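As a rough illustration of how the odds scale with submissions, the sketch below treats every voice sample as one raffle ticket. The actual drawing mechanism is not described here, so the one-ticket-per-sample rule, the function names, and the participant names are all assumptions made for the example.

```python
import random


def win_probabilities(samples_by_participant):
    """Chance of winning per participant, assuming one ticket per voice sample."""
    total = sum(samples_by_participant.values())
    if total == 0:
        raise ValueError("no voice samples were submitted")
    return {name: count / total for name, count in samples_by_participant.items()}


def draw_winner(samples_by_participant, rng=None):
    """Pick one winner, weighting each participant by their number of samples."""
    rng = rng or random.Random()
    names = list(samples_by_participant)
    weights = [samples_by_participant[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical participants: one submitted 4 samples, the other submitted 1.
entries = {"participant_a": 4, "participant_b": 1}
probs = win_probabilities(entries)
ratio = probs["participant_a"] / probs["participant_b"]
print(f"participant_a is {ratio:.0f}x as likely to win as participant_b")
```

Under this assumption, the participant with 4 samples holds 4 of the 5 tickets, which is 4 times the share of the participant with 1 sample.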
Your personal health information will be handled securely. All data collected on our servers is encrypted end-to-end. We will anonymize the data and use it with the intention to publish research papers and translate this work to patients.
We have structured the trial as a 4-week longitudinal study, and we estimate it will take roughly 15-20 minutes of your time.
The first week is the most time-intensive: we will ask you to complete a 10-minute inventory and collect voice data from a few tasks in series.
During the second, third, and fourth weeks, we will just follow up with a few short survey questions and some voice samples, each taking 2-3 minutes.
You can use any laptop, tablet, or mobile phone to complete this study. The only strict requirement is that you have a stable internet connection to submit voice samples.
In general, the best way to provide a voice sample is in a quiet room where you are speaking close to the microphone. If this is possible, please take the survey in a quiet room. Otherwise, your voice sample may be contaminated and discarded from scientific analysis.
Our goal is to enroll between 10,000 and 100,000 patients in 2019-2020 to collect as much data as possible with voice information tied to health traits.
November 1st, 2019 - December 30th, 2020.
Since the study lasts one month for each patient, the last day to enroll in the study is December 1st, 2020.
The study is designed to be longitudinal to take a voice sample at baseline and track patients week-over-week (WoW) for 1 month.
To incentivize patients to complete the study, we increase the likelihood of winning the cash prize with the number of submissions and provide a health report only to those who complete the study.
We know that this design has some limitations (e.g. Multiple Sclerosis episodes are better captured over yearly intervals than weekly intervals), but we believe it provides the best ROI while covering the widest range of health traits.
We're seeking collaborations with individuals, nonprofits, academic facilities, data scientists, pharma/medical device companies, and private donors to advance this initiative.
For Individuals - You can donate your voice as a part of this study at this link. Anyone can partake in the study; we just ask that you answer all questions honestly and to the best of your ability.
For Nonprofits and Academic Facilities - You can send out the link to enroll in the clinical study as a part of a mailing list or embed snippets on your website to recruit participants. We have many of these links and descriptions already made to make this step very easy for you.
For Data Scientists - If you are an independent data scientist (e.g. within Google Research), you can partner with us to help analyze our datasets for use in academic publications.
For Pharma/Medical Device Companies - If you are a pharma company looking to collect voice data as a part of your clinical trials, we encourage you to use the SurveyLex product that we have created for this purpose or channel patients directly to this website to partake in our clinical trial.
For Private Donors - If you're interested in personally supporting our work through a foundation or another vehicle, please reach out to us. We've thought about how we can make this happen through nonprofit collaborators.
If you're interested in any of these areas or have other ways to work together, just reach out to us on the webform at the bottom of this page and we'll get back to you promptly.
To take part in the study, you must be a healthy adult (>18 years of age) in the United States of America who can produce speech vocalizations in English. If you do not fit these criteria, you cannot participate in the study.
Simply put, the Voiceome is the smallest speech sample that conveys the most health-related information. The Voiceome fundamentally has two elements: a speech sample (e.g. a voice sample collected from a prompt like “describe your day from start to finish”) and features (e.g. mel-frequency cepstral coefficients, or MFCCs). This is analogous to the genetic code requiring a sample (e.g. hair follicle) and a sequencing method for features (e.g. library prep and genetic sequencing).
With the Voiceome, you can then use machine learning models on speech features to infer health conditions, like whether someone is depressed based on their fundamental frequency and speaking rate. This is analogous to how you can use genes to infer various health conditions (e.g. if you have an APOE4 allele, it increases your risk for Alzheimer's and lowers the age of onset).
By completely mapping the Voiceome, we can translate the science of vocal biomarkers to patients faster.
Here are some speech tasks that are typically used for vocal biomarker research that we are using in this particular study:
1. Sentence repeating task (10 seconds). Speak this sentence out loud (with text disappearing on screen): “The quick brown fox jumped over the lazy dog.” Measures memory, attention, and phonetic ability.
2. Phoneme repeating task (10 seconds). Repeat ‘Ahhhhhh’ out loud. Measures spectral power, which is useful for Parkinson’s disease.
3. Picture description task (30 seconds). Describe this picture. [from a standard picture bank]. Measures free speech and higher-order thought associations.
4. Free speech task (30 seconds). Describe your day from start to finish. Another free speech task.
5. Gratitude task (30 seconds). What are you grateful for today? Measures contrasting emotions (e.g. happy vs. sad).
6. Verbal fluency task (30 seconds). Describe as many animals as you can in 30 seconds.
7. Countback task (20 seconds). Count down from 300 until the time expires.
For more information on how to select the best speech tasks for your given problem, check out this lookbook.
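The task list above can also be written down as a small configuration, which makes the total recording time easy to check. This is only a sketch: the `SpeechTask` class and the short task identifiers are hypothetical names chosen for the example, while the prompts and durations are copied from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechTask:
    name: str      # hypothetical short identifier, not an official task ID
    seconds: int   # recording length stated in the task list
    prompt: str    # instruction shown or read to the participant


TASKS = [
    SpeechTask("sentence_repeat", 10, "The quick brown fox jumped over the lazy dog."),
    SpeechTask("phoneme_repeat", 10, "Ahhhhhh"),
    SpeechTask("picture_description", 30, "Describe this picture."),
    SpeechTask("free_speech", 30, "Describe your day from start to finish."),
    SpeechTask("gratitude", 30, "What are you grateful for today?"),
    SpeechTask("verbal_fluency", 30, "Describe as many animals as you can in 30 seconds."),
    SpeechTask("countback", 20, "Count down from 300 until the time expires."),
]

# Sum the stated durations to get the recording time for one full pass.
total_seconds = sum(task.seconds for task in TASKS)
print(f"{len(TASKS)} tasks, {total_seconds} seconds of recording")
```

With the durations listed, one pass through all seven tasks comes to 160 seconds of recorded speech, not counting time spent reading instructions.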
We commonly get asked how many and what types of features we extract to build machine learning models. Simply, there are 4 main types of features that we can extract:
1. Audio features - acoustic features extracted from an audio file (e.g. fundamental frequency)
2. Text features - language features extracted from a transcript (e.g. noun frequency)
3. Mixed features - features that combine audio features and text features, often as a ratio (e.g. speaking rate).
4. Meta features - features derived from machine learning models on any of the embeddings described above (e.g. age - 20s).
You can learn more about features in this GitHub repository or this voice computing textbook.
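To make the feature types more concrete, here is a minimal sketch of a text feature and a mixed feature computed with the Python standard library only. It is not the study's feature pipeline: the function names are made up for the example, the type-token ratio stands in for richer text features such as noun frequency (which would need a part-of-speech tagger), and the recording duration is passed in by hand because no audio file is read.

```python
import re


def tokenize(transcript):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def text_features(transcript):
    """Text features: computed from the transcript alone."""
    words = tokenize(transcript)
    unique_words = len(set(words))
    return {
        "word_count": len(words),
        "type_token_ratio": unique_words / len(words) if words else 0.0,
    }


def speaking_rate(transcript, duration_seconds):
    """Mixed feature: words (from text) per second (from the audio duration)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokenize(transcript)) / duration_seconds


# Hypothetical transcript of a 4-second recording.
transcript = "I woke up early and I made coffee"
features = text_features(transcript)
features["speaking_rate_wps"] = speaking_rate(transcript, duration_seconds=4.0)
print(features)
```

Acoustic features such as fundamental frequency need a signal-processing library and are left out of this sketch.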
Right now we’re training mostly simple classification models because many of the datasets that we have curated internally have relatively small sample sizes (e.g. 100-1000 labels per class). Some of the techniques used to model the data include:
- Naive Bayes (NB)
- Decision tree
- Support vector machines (SVM)
- Maximum entropy
- Gradient boosting
- Logistic regression
- Hard voting
- K nearest neighbors (KNN)
- Random forest
However, for larger datasets we build and optimize models with deep learning techniques (e.g. attention-based neural networks). One of the goals of the Human Voiceome Study is to create a standard feature embedding for voice-related health research. If we receive 100,000 entries, we can create a standard feature embedding using autoencoder neural networks to distinguish normal from abnormal individuals with health trait labels. Future studies in this space can then use this standard feature embedding to help featurize and model data in the field.
We have created an Innovation Fellows Program to engage outstanding external scientists to analyze the data collected from the Human Voiceome Project. This program is structured as a competition over a 6-month period. If this is something that interests you, feel free to apply at innovate.neurolex.co.
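Of the simpler techniques listed above, hard voting is the easiest to show in a few lines: several classifiers each predict a label, and the label with the most votes wins. The sketch below is a generic illustration under that definition, not our production code; the `hard_vote` function, the model comments, and the labels are hypothetical.

```python
from collections import Counter


def hard_vote(predictions):
    """Combine per-model label predictions by majority vote.

    predictions: one list per model, each holding one label per sample.
    Ties go to the label that appears first among the models' votes.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(model_preds) != n_samples for model_preds in predictions):
        raise ValueError("every model must predict the same number of samples")
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs from three classifiers on three voice samples.
model_outputs = [
    ["depressed", "control", "control"],    # e.g. Naive Bayes
    ["depressed", "depressed", "control"],  # e.g. SVM
    ["control", "depressed", "control"],    # e.g. logistic regression
]
ensemble = hard_vote(model_outputs)
print(ensemble)
```

An odd number of models avoids ties in a two-class problem, which is one reason small ensembles of three or five classifiers are common.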