There has been a lot of reading and searching recently, with no particular results. As people of science keep repeating, a lack of results is a result in itself, so I’ll just write down what has happened to try to understand where to go next.

I was trying to find a field of Statistics or Machine Learning which deals with data gathered from different data sources.

  1. Asked the professor who deals with Missing Data. Doesn’t know of any research in that field.

  2. Asked the professor who deals with Regression. Same.

  3. Asked people on CrossValidated. Nothing rings a bell.

The closest resource I could find that talks about dealing with multiple data sources when there is explicit knowledge about their structure is the Ontology Summit.

Also, there are some pointers to places where ontologies or other knowledge representations could be used with data analysis:

  1. Graphical Models: DAGs represent causal relationships between variables.

  2. The Gene Ontology is used in data analysis in some way.

But the main idea I got from searching for connections between Statistics and Ontologies is that the latter are much harder to find. Hence, searching for data analysis problems in a domain where Ontologies are used must be easier than searching for usage of Ontologies across all data analysis applications.

By now, the situation is pretty severe. I have an idea that some problem is very important to solve, but cannot find any people solving it. This can only be resolved by me finding data to solve the problem myself. The data should probably come from the data integration world, for the reason highlighted earlier, and the offer should be to perform statistical analysis on it. I’m trying to look into multiple projects for integrating biomedical data, including automated diagnosis projects. I should also take a look at different industries.

More news to come.