| Particulars |
Details |
| |
| Title : |
Design and implement support for better statistics collection in a relational DBMS. |
| |
| Group Members : |
The following are the group members: |
| |
|
| |
| Abstract : |
The project will implement different algorithms to calculate:
- The cardinality of large datasets.
- The number of distinct values of an attribute.
The relative merits and demerits of these algorithms will be compared.
Because we plan to use the xgraph utility to plot graphs, we assume that it is available in the target environment.
|
| |
| Deliverables : |
The project will have the following features: |
|
- A working package comprising of
- Comparison of various algorithms for finding cardinality of large datasets and number of distinct values of an attribute.
- Comparison of running time, accuracy and space utilisation.
- Comparison statistics and inferences as part of the project report.
|
| |
| Sources of Information : |
The following papers have been referenced: |
| |
- Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports, Phillip B. Gibbons, VLDB 2001.
- Probabilistic Counting Algorithms for Database Applications, Philippe Flajolet and G. Nigel Martin.
- Synopsis Data Structures for Massive Datasets, Phillip B. Gibbons and Yossi Matias.
|