I have data. We sent off one of my samples for sequencing and got the results back a couple of days ago. Now I need to start learning what to do with this type of data. It is really exciting to have this data because, while it is still very preliminary, it means that I have a springboard to start thinking about what might be going on in the environments that I am studying. It also means that our process of DNA extraction and amplification (from very difficult samples) worked, at least for this sample.
I like that scientific research comes in phases. First you have to plan out your experiments or your sample analyses. This often involves a lot of background research to figure out what other people have done to ask similar questions, and how they did it. Then you have to do the actual lab or field work. In the case of the lab work I have been doing, a lot of this feels like one step forward, two steps backward, because each time we figure one thing out in our procedure, there is another issue to deal with. We spend a lot of time trying to figure out why our extractions or DNA amplification isn't working, and deciding which of the 20 factors to tweak the next time around. If we get lucky and pick the right one, things progress fast. If not, we can spend weeks or even months trying to navigate around some roadblock that is standing between us and the data we seek. Once samples have been collected, experiments have been carried out, and experimental samples have been preserved or analyzed, we are ready to proceed to the next phase... data analysis.
Once there is data in the picture, the game changes. A new set of challenges arises because the goal in this phase is to figure out what the data are telling you. Sometimes this is frustrating because there is no clear story, and sometimes you realize you need to back up and get more data or slightly different data in order to really understand what is going on. In the world of genetic sequences, that task becomes figuring out what hundreds of thousands of A's, C's, T's, and G's mean. There are databases to help you figure out what organisms are in your samples, but in environments like hydrothermal vents, where so many of the microbes are uncultured and unknown, these databases are of limited use. So now it falls to me to learn a new set of skills that involve computer savvy (using new programs and platforms), a thorough understanding of genetics, and I'm not sure what else. Bioinformatics... here I come!
After preliminary data typically come more experiments and additional data collection. Eventually you decide you have enough data to tell a compelling story and then the next phase begins... writing. That one is a long way off, but it is the ultimate goal: to write up your results and get them published. In reality these phases often co-occur if you are working on various projects or various aspects of the same project. The way people do science very rarely matches the way that middle school science textbooks describe the "scientific method", but to me these phases represent different mindsets, and the transition from one to the next makes me feel like I have accomplished something.
I am excited because this data provides a peek into the next phase of my science... data analysis. It will be fun to not just be doing lab work for a while. It is intimidating and exciting to have a whole new set of skills (bioinformatics) to begin learning. This whole process has been one steep learning curve after another. It keeps you busy, and the transition between phases prevents boredom.