The objective of this project is to understand how ecosystem dynamics, characterized mostly by vegetation changes, are causally affected by regulating climatic factors, anthropogenic disturbances, and extreme events across geographical areas with distinct eco-climatic variability. Although this is a well-studied problem, the state of the art has significant room for improvement. For example, climate variables such as precipitation, solar radiation, and temperature have traditionally been studied as limiting factors for vegetation growth. However, a quantitative analysis of the influence of unknown or unexpected driving forces on vegetation anomalies is still missing in the context of abrupt climate change (e.g., persistent drought, heat waves) and human-induced local events (e.g., forest fire, irrigation). Similarly, most prior studies have assumed linearity, or certain types of nonlinearity, in the dependency relationships between observed or modeled climate variables and satellite-derived vegetation indices. We hypothesize that such assumptions may not hold in practice when scaled over large regions, so that the resulting models fail to generalize and the understanding of ecosystem dynamics drawn from them may be misleading. In this study we propose to use a regression technique called 'symbolic regression' to learn these complex spatiotemporal relationships and their evolution over time. This genetic-programming-based learning technique has demonstrated the potential to discover previously unknown dependency structures among variables. Using symbolic regression, we expect not only to uncover new relationships among well-studied climate variables, but also to identify latent factors responsible for vegetation anomalies.
We will use NASA's high-end computing and data infrastructure at the NASA Earth Exchange facility to scale this regression technique, which is based on evolutionary optimization, and to build global prediction frameworks using hierarchical and ensemble approaches. This work will improve the understanding of ecosystem dynamics and generalize that understanding from regional to global scales. It will leverage the results of previous NASA-funded efforts, in terms of both data sets and computing infrastructure. The work aims to answer three science questions, starting at the local level and then moving to a global scale: 1) what is the magnitude and extent of ecosystem exposure, sensitivity, and resilience to the 2005 and 2010 Amazon droughts; 2) what are the human-induced and other attribution factors that cause vegetation anomalies in certain geographical regions and that cannot otherwise be explained by natural climate variability; and 3) how does the learned dependency of vegetation on climate variables and other exogenous factors vary across different eco-climatic zones and geographical regions on a global scale? This project will develop algorithms to answer these three questions, with the results validated by domain scientists. This modeling exercise based on symbolic regression is the first of its kind for Earth science applications. Therefore, the entry technology readiness level (TRL) of the project is 2. On successful completion of the milestones, the exit TRL is expected to be 4, at which point the capabilities will have been validated at a global scale. The interdisciplinary team has expertise in large-scale data mining, symbolic regression, evolutionary optimization, and Earth science to meet the technical challenges of this project.