Orion 2.0 supports both continuous and discrete uncertainty,
both of which are represented by their pdfs.
This is illustrated by the sensor data example in Table 1 (continuous uncertainty) and
Table 2 (discrete uncertainty).
The uncertainty in many applications
can be expressed using standard distributions. Orion has
built-in support for commonly used continuous (e.g., Gaussian, Uniform) and discrete (e.g., Binomial, Poisson)
distributions. Uncertain values that follow such distributions are processed symbolically
in the database. For example, when the Gaussian distribution is used, only the two parameters
of the distribution, the mean and the variance, are stored, as shown in Table 1.
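As a minimal sketch of what symbolic storage means for the Gaussian case, the Python fragment below keeps only the mean and variance and answers a range-probability question in closed form. The class name `Gaussian`, its methods, and the sample reading are hypothetical illustrations, not Orion's actual interface or data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Gaussian:
    """Hypothetical symbolic value: only the two parameters are stored."""
    mean: float
    variance: float

    def cdf(self, x: float) -> float:
        # Closed-form Gaussian CDF expressed through the error function.
        sd = math.sqrt(self.variance)
        return 0.5 * (1.0 + math.erf((x - self.mean) / (sd * math.sqrt(2.0))))

    def prob_between(self, lo: float, hi: float) -> float:
        # P(lo <= X <= hi), computed from the parameters alone.
        return self.cdf(hi) - self.cdf(lo)


# Illustrative (made-up) sensor reading: mean 20, variance 4.
reading = Gaussian(mean=20.0, variance=4.0)
p_within_one_sd = reading.prob_between(18.0, 22.0)  # about 0.68
```

Because the value is described entirely by two numbers, no samples or buckets need to be stored, and the range probability is exact up to floating-point error.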
When the underlying data cannot be represented using a standard distribution, Orion automatically
converts it to an approximate distribution, including histograms and discrete sampling.
A histogram consists of buckets over the data
domain, along with the probability density in each bucket.
A discrete sampling simply consists of multiple value-probability
pairs. The bin size (or number of sampling points)
is an important parameter that controls the trade-off between
accuracy and efficiency: smaller bins (or more sampling points) follow the
underlying distribution more closely but cost more to store and process.
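The two approximate representations can be sketched as follows. This is an illustration under stated assumptions, not Orion's conversion routine: the function names, the equal-width bucketing, the use of bucket midpoints as sample values, and the example distribution (density 2x on [0, 1], so the CDF is x squared) are all hypothetical choices made for the example.

```python
def to_histogram(cdf, lo, hi, num_bins):
    """Equal-width buckets over [lo, hi] as (left, right, density) triples."""
    width = (hi - lo) / num_bins
    buckets = []
    for i in range(num_bins):
        left = lo + i * width
        right = left + width
        mass = cdf(right) - cdf(left)           # probability inside the bucket
        buckets.append((left, right, mass / width))
    return buckets


def to_discrete_sampling(cdf, lo, hi, num_points):
    """Value-probability pairs: each bucket midpoint carries the bucket's mass."""
    return [((left + right) / 2.0, density * (right - left))
            for left, right, density in to_histogram(cdf, lo, hi, num_points)]


def histogram_prob(buckets, a, b):
    """Approximate P(a <= X <= b), assuming uniform density within a bucket."""
    total = 0.0
    for left, right, density in buckets:
        overlap = max(0.0, min(b, right) - max(a, left))
        total += density * overlap
    return total


def example_cdf(x):
    # Assumed non-standard distribution on [0, 1] with density 2x.
    return x * x


exact = example_cdf(0.5) - example_cdf(0.2)      # true P(0.2 <= X <= 0.5)
coarse = histogram_prob(to_histogram(example_cdf, 0.0, 1.0, 2), 0.2, 0.5)
fine = histogram_prob(to_histogram(example_cdf, 0.0, 1.0, 20), 0.2, 0.5)
err_coarse = abs(coarse - exact)
err_fine = abs(fine - exact)
samples = to_discrete_sampling(example_cdf, 0.0, 1.0, 4)
```

With 2 buckets the range query is noticeably off, while 20 buckets bring the answer much closer to the exact value at the cost of ten times as many stored triples, which is the accuracy versus efficiency trade-off described above.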
Rohit Jain
2011-08-02