Immunology and APIs

I'm applying for a job outside of my field. More concretely, I found a job offer at the German Cancer Research Center for a Data Scientist. The task is to work on an immunology API https://docs.airr-community.org/en/stable/packages/airr-python/api.html for openly sharing the genetic sequences and associated metadata of cells in the adaptive immune system, i.e. the part of the immune system that keeps a memory of pathogens and that vaccines rely on.

So what is an API? It is a piece of software that allows a controlled exchange between a client and a server. You send a message to the server, it does something and sends back some data. The server speaks its own language, and for communication to succeed you have to send it something that it understands. REST is a style of designing this kind of web communication: the client sends standard HTTP requests to URLs that represent resources, and the server answers with data. It enables programmatic access to data that is stored somewhere. For example, Twitter has an API that you can query to get a roughly 1% sample of the world's tweets so that you can study them or display them on your very own website. When machine learning models are deployed, for example using AWS SageMaker, you create an API endpoint: a client sends some data, the endpoint runs it through the trained model and returns a prediction. So APIs are cool and they're input and output machines! Learn more about bioinformatics APIs here https://www.ebi.ac.uk/training-beta/online/courses/embl-ebi-programmatically/introduction-to-programmatic-access/
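To make the "input and output machine" idea concrete, here is a minimal sketch in Python using only the standard library. It is not the AIRR API or any real service: the toy server, the /records/ path, and the made-up record "seq1" are all assumptions for illustration. The server runs locally, the client sends one GET request, and the server answers with JSON.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical toy data standing in for whatever the server stores.
RECORDS = {"seq1": {"sequence": "ACGT", "length": 4}}


class RecordHandler(BaseHTTPRequestHandler):
    """Answers GET /records/<id> with the matching record as JSON."""

    def do_GET(self):
        # Treat the last path segment as the record id, e.g. /records/seq1
        record_id = self.path.rstrip("/").split("/")[-1]
        record = RECORDS.get(record_id)
        status = 200 if record is not None else 404
        payload = record if record is not None else {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_record(record_id):
    """Start the toy server, send one GET request, return (status, data)."""
    server = HTTPServer(("127.0.0.1", 0), RecordHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", f"/records/{record_id}")
        response = conn.getresponse()
        data = json.loads(response.read())
        conn.close()
        return response.status, data
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_record("seq1"))
    print(fetch_record("unknown"))
```

The client only needs to know the URL pattern and that the answer comes back as JSON; how the server stores the records stays hidden. Asking for an id the server does not know returns a 404 status with an error message, which is the server's way of saying it did not understand the request.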

What charms me about this job is that it enables controlled and organized sharing of scientific data. I adore open data! My entire career is enabled by and built on free access to scientific data. During my MSc, my first publications happened because I found an interesting gene expression dataset on a bioinformatics database https://www.ebi.ac.uk/arrayexpress/. I thought of a way to apply a machine learning algorithm to it, ran a literature search and wrote to a scientist who did cool things with gene expression data. We ended up publishing 3 papers together. For my MSc thesis I worked on the Allen Brain Observatory open calcium imaging data http://observatory.brain-map.org/visualcoding. Then I went to a neural data mining course in Berkeley and met professor Maneesh Sahani from UCL, who recommended an open calcium imaging dataset from Carsen and Marius https://figshare.com/articles/dataset/Recordings_of_ten_thousand_neurons_in_visual_cortex_in_response_to_2_800_natural_images/6845348. I played with the data, applied some algorithms to it and wrote to Carsen, and they hired me to work as a coder on the calcium imaging processing pipeline suite2p. I ended up working on a clustering algorithm with them and wrote a visualization plugin that will hopefully be merged soon (just have to make a few fixes :-)) https://github.com/MouseLand/suite2p/pull/520


So the point is that data is cool. Scientific data is just the best. And if we could standardize our work and pool our efforts, we just might find the cure for cancer and schizophrenia.
