SearchAllVariables

Creates a numeric search for each variable to find the best model that describes its dependence on the other variables.

Choose the set of variables to search over. Here it is every variable in the data source.

from eureqa import Eureqa
eureqa = Eureqa(url='http://localhost:10002', user_name='root', password='Changeme1')
data_source = eureqa.create_data_source('Sample_data_1', '../sample.csv')
variables = set(data_source.get_variables())
variables
{u'(Calories by Breakfast (%))',
 u'(Calories by Dinner (%))',
 u'(Calories by Lunch (%))',
 u'(Calories by Snacks (%))',
 u'(Carbohydrates (g))',
 u'(Exercise Calories)',
 u'(Fat (g))',
 u'(Over/Under Cal Budget)',
 u'(Protein (g))',
 u'Calories',
 u'Date',
 u'Steps',
 u'Weight'}

For each variable, start a search for its best model, using all the other variables as inputs.

searches = []
for variable in variables:
    settings = eureqa.search_templates.numeric('Test_5', variable, variables - {variable})
    search = data_source.create_search(settings)
    search.submit(1)
    searches.append(search)
searches
[<eureqa.search.Search instance at 0x10bef43f8>,
 <eureqa.search.Search instance at 0x10bf751b8>,
 <eureqa.search.Search instance at 0x10bebcc68>,
 <eureqa.search.Search instance at 0x10bef4638>,
 <eureqa.search.Search instance at 0x10bf64c20>,
 <eureqa.search.Search instance at 0x10bf64128>,
 <eureqa.search.Search instance at 0x10bf64bd8>,
 <eureqa.search.Search instance at 0x10bf8fdd0>,
 <eureqa.search.Search instance at 0x10bf8ff38>,
 <eureqa.search.Search instance at 0x10bebcdd0>,
 <eureqa.search.Search instance at 0x10bef48c0>,
 <eureqa.search.Search instance at 0x10bf55e60>,
 <eureqa.search.Search instance at 0x10bf8f440>]
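
The target/input split in the loop above relies on Python set difference: `variables - {variable}` is every variable except the current target. A minimal sketch of that pairing in plain Python, assuming a shortened, hypothetical variable list and making no Eureqa calls:

```python
# Shortened, hypothetical variable list; the real set comes from
# data_source.get_variables() in the cell above.
variables = {'Calories', 'Steps', 'Weight'}

# Pair each target with the set of all remaining variables as its inputs.
pairs = {target: variables - {target} for target in variables}

for target in sorted(pairs):
    print('%s <- %s' % (target, sorted(pairs[target])))
```

Each target is excluded from its own input set, so a model can never be expressed in terms of the variable it is meant to predict.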

Wait for each search to finish, in the order the searches were submitted, and print its best model.

for search in searches:
    search.wait_until_done()
    best_solution = search.get_best_solution()
    print('Best model for the %s variable is %s' % (best_solution.target, best_solution.model))
Best model for the (Exercise Calories) variable is 0.0473766837289552*min(Steps, 10.9011910154713*Steps) - 13.4250563890874
Best model for the (Calories by Breakfast (%)) variable is 16.3939916080802
Best model for the Weight variable is 229.770341796561 - 0.17912786440584*max(Date, 54.179697426376)
Best model for the (Calories by Snacks (%)) variable is 32.4959206475677 - 0.434613605762846*(Calories by Dinner (%))
Best model for the (Calories by Lunch (%)) variable is 47.9898629072862 - 0.325398892210761*(Calories by Dinner (%)) - 0.0165536988616873*(Calories by Dinner (%))*(Calories by Snacks (%))
Best model for the (Over/Under Cal Budget) variable is 2384.70928721722 - 1.7675145189278*(Fat (g)) - 0.877630867725491*max((Exercise Calories), Calories)
Best model for the Calories variable is 684.172924912191 + 7.73282441233836*(Fat (g)) + 3.50549054922923*(Carbohydrates (g))
Best model for the (Fat (g)) variable is 3.36917606184775*sqrt(Calories) - 82.0612962083521
Best model for the Steps variable is 33.1852760489287*max((Exercise Calories), 33.3499820047831) - 1011.65433350096 - 0.0160082990524103*(Exercise Calories)*max((Exercise Calories), 1.68285174614339)
Best model for the (Protein (g)) variable is 43.4255675982683 + 0.322019477931614*(Fat (g))
Best model for the Date variable is 65.4984572007566
Best model for the (Calories by Dinner (%)) variable is 74.1390788664002 - 0.743537429629981*(Calories by Lunch (%)) - 0.866427542370519*(Calories by Snacks (%))
Best model for the (Carbohydrates (g)) variable is 253.280411230336 + 0.418048253927553*(Fat (g)) - 0.0626884556883835*(Over/Under Cal Budget)
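
Each printed model is an ordinary arithmetic expression over the other variables, so it can be evaluated by hand. As an illustration, the sketch below copies the coefficients from the Calories line of the output above into a plain Python function. The function name and the input values are hypothetical, chosen only to show the substitution.

```python
def calories_model(fat_g, carbohydrates_g):
    # Coefficients copied from the "Calories" line of the output above.
    return (684.172924912191
            + 7.73282441233836 * fat_g
            + 3.50549054922923 * carbohydrates_g)

# Hypothetical input values, not taken from the sample data.
predicted = calories_model(fat_g=60.0, carbohydrates_g=250.0)
print('Predicted calories: %.1f' % predicted)
```

A constant model, such as the one printed for Date, has no inputs to substitute and predicts the same value for every row.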

Plot the error-metric value of each search's best solution.

%matplotlib inline
import matplotlib.pyplot as plt

error_metric = searches[0].error_metric
solutions = [search.get_best_solution() for search in searches]

plt.plot(range(len(solutions)), [s.get_error_metric_value(error_metric) for s in solutions])
plt.title('Solution quality')
plt.show()
[Figure: line plot titled "Solution quality" showing the error-metric value of the best solution from each search.]
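
The plotting code above places the solutions on the x-axis by list position, so it does not show which variable each point belongs to. One way to make the comparison easier to read is to pair each target name with its error value and sort. The sketch below uses hypothetical placeholder numbers rather than results from these searches, and assumes an error metric where lower is better.

```python
# Hypothetical (target, error) pairs. With real searches these would be
# built from best_solution.target and
# best_solution.get_error_metric_value(error_metric).
errors = [('Calories', 0.12), ('Steps', 0.35), ('Weight', 0.08)]

# Sort so the lowest error comes first (assumes lower is better).
ranked = sorted(errors, key=lambda pair: pair[1])

for target, error in ranked:
    print('%-10s %.2f' % (target, error))
```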