Optimizing large knowledge networks in spatial computers

Dominic Pacher; Robert Binna; Günther Specht; Dominic Pacher; Robert Binna; Günther Specht

doi:10.1017/S0269888916000187

2016 Volume 31

Article Contents

Next Previous

RESEARCH ARTICLE Open Access

Optimizing large knowledge networks in spatial computers

Databases and Information Systems

More Information

Published online: 07 December 2016
The Knowledge Engineering Review 31, Article number: 10.1017/S0269888916000187 (2016) | Cite this article

Abstract

Abstract: This paper presents a novel concept of a Spatially Aware Graph Store, which realizes a Graph Store on top of a spatial computer architecture to manage graphs in one, two or three physical dimensions. In this environment, the physical distance between graph nodes strongly affects graph traversal performance. Consequently, a Spatially Aware Graph Store needs to minimize these distances to operate efficiently. We show that this minimization can be achieved in two ways. First, by increasing the dimensionality of the spatial computer and second by applying optimization methods. For the latter, this work introduces a novel Mid Point Optimization method to quickly optimize large real-world knowledge networks by rearranging nodes in a way that distances between linked nodes are reduced. In addition, a Local Optimization method is subsequently applied to refine the result. Finally, the Node Decomposition method is presented that splits nodes with many edges into several smaller nodes to achieve a further reduction of distances between linked nodes.Our results show that the overall distances between nodes can be reduced by three orders of magnitude for 3D in comparison to one-dimensional (1D) Spatially Aware Graph Stores. The suggested Mid Point Optimization method achieves a reduction by another order of magnitude. In a 3D spatial computer, Local Optimization is capable of reducing distances by another 20%. However, in 1D and 2D spatial computers it becomes a prohibitive time consuming method. Finally, the Node Decomposition enables an additional distance reduction by 40% in Scale Free Graph Data sets.
Rights and permissions
© Cambridge University Press, 2016 2016Cambridge University Press

References

Abadi, D. J., Marcus, A., Madden, S. R. & Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases, 411–422. VLDB Endowment.

Google Scholar

Abou-Rjeili A. & Karypis G. 2006. Multilevel algorithms for partitioning power-law graphs. In Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, IPDPS’06, 10pp. IEEE.

Google Scholar

Binna, R. et al. (2010). SpiderStore: exploiting main memory for efficient RDF graph representation and fast querying. In Proceedings of the Workshop on Semantic Data Management (SemData). VLDB.

Google Scholar

Brandt A., Ron D. & Amit D. 1986. Multi-level approaches to discrete-state and stochastic problems. Multigrid Methods II 1228, 65–98.

Google Scholar

Broekstra J., Kampman A. & van Harmelen F. 2002. Sesame: a generic architecture for storing and querying RDF and RDF schema. In Semantic Web – ISWC’02, 2342, 54–68. Aidministrator Nederland BV.

Google Scholar

Butera B. 2002. Programming a paintable computer. PhD thesis, Massachusetts Institute of Technology.

Google Scholar

DeHon A., Giavitto J.-L. & Gruau F. 2007. Executive Report. In: Dagstuhl Seminar Proceedings, 06361—Computing Media and Languages for Space-Oriented Computation.

Google Scholar

Delorimier M. & Kapre N. 2006. GraphStep: a system architecture for sparse-graph algorithms. In Field-Programmable Custom Computing Machines, 2006, FCCM’06, 14th Annual IEEE Symposium on IEEE, 143–151.

Google Scholar

Delorimier, M., Kapre, N., Mehta, N. & Dehon, A. (2011). Spatial hardware implementation for sparse graph algorithms in GraphStep. ACM Transactions on Autonomous and Adaptive Systems, 6. 1-20.

Google Scholar

Erling O. 2008. Towards web scale RDF. In The 4th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS).

Google Scholar

Erling O. & Mikhailov I. 2009. RDF support in the Virtuoso DBMS. Networked Knowledge – Networked Media. Studies in Computational Intelligence. 221, 7–24.

Google Scholar

Fishburn P., Tetali P. & Winkler P. 2000. Optimal linear arrangement of arectangular grid. Discrete Mathematics 213, 123–139.

Google Scholar

Garey M. R. & Johnson D. S. 1979. Computers and Intractability; A Guide to theTheory of NP-Completeness. W. H. Freeman & Co.

Google Scholar

Harel D. & Koren Y. 2004. Graph drawing by high-dimensional embedding. Journal of Graph Algorithms and Applications 8, 195–214.

Google Scholar

Harris S., Lamb N. & Shadbolt N. 2009. 4store: the design and implementation of a clustered RDF store. In 5th International Workshop on Scalable Semantic Web KnowledgeBase Systems (SSWS2009), 94–109.

Google Scholar

Harth, A., Umbrich, J., Hogan, A. & Decker, S. (2007). Yars2: a federated repository for searching and querying graphstructured data. In Proceedings of the 6th International Semantic Web Conference—ISWC 06, 211–224.

Google Scholar

Hu Y. 2006. Efficient, high-quality force-directed graph drawing. The Mathematica Journal 10, 37–71.

Google Scholar

Huang J., Abadi D. J. & Ren K. 2011. Scalable SPARQL querying of large RDF graphs. In Proceedings of the VLDB Endowment, 4, 1123–1134.

Google Scholar

Husain M. & McGlothlin J. 2011. Heuristics-based query processing for large RDF graphs using cloud computing. IEEE Transactions on Knowledge and Data Engineering 23, 1312–1327.

Google Scholar

Janik M. & Kochut K. 2005. BRAHMS: a work bench RDF store and high performance memory system for Semantic Association Discovery. In Fourth International Semantic Web Conference, 431–445.

Google Scholar

Johnson, D. S., Aragon, C. R., Mcgeoch, L. A. & Schevon, C. (1989). Optimization by simulated annealing: an experimental evaluation; part I, graph partitioning. Operations Research 37, 865-892.

Google Scholar

Koren Y. & David H. 2002. A multi-scale algorithm for the linear arrangement problem. In Graph-Theoretic Concepts in Computer Science, 296–309.

Google Scholar

Mondal J. & Deshpande A. 2012. Managing large dynamic graphs efficiently. In Proceedings of the 2012 International Conference on Management of Data—SIGMOD 12, 145–156.

Google Scholar

Neumann T. & Weikum G. 2008. RDF-3X: a RISC-style engine for RDF. In Proceedings of the VLDB Endowment 1, 647–-659.

Google Scholar

Nguyen, Q., Eades, P., Hong, S. & Huang, W. (2011). Large crossing angles in circular layouts. In Proceedings of the18th International Conference on Graph Drawing, 397–399.

Google Scholar

Ousterhout, J., Agrawal, P., Erickson, D., Kozyrakis, C., Leverich, J., Mazières, D., Mitra, S., Narayanan, A., Ongaro, D., Parulkar, G., Rosenblum, M., Rumble, S., Stratmann, E. & Stutsman, R. (2011). The case for RAM Cloud. Communications of the ACM 54, 121.

Google Scholar

Petit J. 2003. Experiments on the minimum linear arrangement problem. Journal of Experimental Algorithmics 8, 112–128.

Google Scholar

Rodriguez-Tello E., Hao J.-K. & Torres-Jimenez J. 2006. A refined evaluation function for the MinLA problem. In MICAI2006: Advances in Artificial Intelligence, 392–403.

Google Scholar

Rodriguez-Tello E., Hao J.-K. & Torres-Jimenez J. 2008. An effective two-stage simulated annealing algorithm for the minimum linear arrangement problem. Computers & Operations Research 35, 3331–3346.

Google Scholar

Safro I., Ron D. & Brandt A. 2006. Graph minimum linear arrangement by multilevel weighted edge contractions. Journal of Algorithms 60, 24–41.

Google Scholar

Safro I., Ron D. & Brandt A. 2009. Multilevel algorithms for linear ordering problems. Journal of Experimental Algorithmics 13, 1.4:1–1.4:20.

Google Scholar

Shao B., Wang H. & Li Y. 2012. The Trinity Graph Engine. Technical Report, Microsoft Research.

Google Scholar

Shvachko K. & Kuang H. 2010. The hadoop distributed file system. In IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 1–10.

Google Scholar

Walshaw C. 2003. A multilevel algorithm for force-directed graph drawing. Journal of Graph Algorithms and Applications 7, 253–285.

Google Scholar

Wilkinson, K., Sayers, C., Kuno, H. & Reynolds, D. (2003). Efficient RDF storage and retrieval in Jena2. In Proceedings of the 1st International Workshop on Semantic Web and Databases, 35–43.

Google Scholar

Wulf W. A. & McKee S. A. 1995. Hitting the memory wall: implications of the obvious. ACMSIGARCH Computer Architecture News 23, 20–24.

Google Scholar

Zeng, K., Yang, J., Wang, H., Shao, B. & Wang, Z. (2013). A distributed graph engine for web scale RDF data. In Proceedings of the 39th International Conference on Very Large Data Bases, 265–276.

Google Scholar

About this article

Cite this article

Dominic Pacher, Robert Binna, Günther Specht. 2016. Optimizing large knowledge networks in spatial computers. The Knowledge Engineering Review. 31:187 doi: 10.1017/S0269888916000187

Dominic Pacher, Robert Binna, Günther Specht. 2016. Optimizing large knowledge networks in spatial computers. The Knowledge Engineering Review. 31:187 doi: 10.1017/S0269888916000187

Download PDF

Article Metrics

Article views(79) PDF downloads(58)

{{lists.name}}

Optimizing large knowledge networks in spatial computers

Abstract

Rights and permissions

References

About this article

Cite this article

Article Metrics

Access History

Other Articles By Authors