Garden: a real-time processing framework for continuous top-k trajectory similarity search

  • Regular Paper
Knowledge and Information Systems

Abstract

Continuous top-k trajectory similarity search (CkSearch) is now commonly required in real-time, large-scale trajectory analysis, where it enables distributed stream processing engines to discover dynamic patterns. As a fundamental operator, CkSearch supports applications such as contact tracing during an outbreak and smart transportation. Although extensive efforts have been made to improve the efficiency of non-continuous top-k search, existing approaches do not address two requirements: dynamic indexing (R1) and incremental computation (R2). In this paper, we therefore propose Garden, a generic CkSearch-oriented framework for distributed real-time trajectory stream processing on Apache Flink. To address R1, we design a distributed dynamic spatial index called Y-index, which consists of a real-time load scheduler and a two-layer indexing structure. To address R2, we introduce a state-reuse mechanism and index-based pruning methods that significantly reduce the computational cost. Empirical studies on real-world data validate the usefulness of our proposal and show a substantial advantage over state-of-the-art solutions in the literature.
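
For illustration only, the incremental-computation requirement (R2) can be sketched in a few lines of Python. This is not Garden's implementation: it assumes a simple lock-step distance (the running sum of Euclidean distances between time-aligned points), assumes every trajectory reports one point per tick, and omits the Y-index, pruning, and Apache Flink entirely. The class name `IncrementalTopK` and its `update` method are hypothetical.

```python
import heapq
import math
from collections import defaultdict


class IncrementalTopK:
    """Hypothetical sketch: one saved running distance per trajectory."""

    def __init__(self, k):
        self.k = k
        # Saved state: trajectory id -> distance accumulated over all past ticks.
        self.running = defaultdict(float)

    def update(self, query_point, positions):
        """Fold one tick of time-aligned points into the state, return the top-k ids.

        Only the newest point of each trajectory is touched, so the cost per
        tick does not grow with trajectory length.
        """
        for traj_id, point in positions.items():
            self.running[traj_id] += math.dist(query_point, point)
        # The k smallest accumulated distances are the k most similar trajectories.
        return heapq.nsmallest(self.k, self.running, key=self.running.get)


search = IncrementalTopK(k=2)
first = search.update((0, 0), {"a": (1, 0), "b": (5, 0), "c": (0, 2)})
second = search.update((1, 0), {"a": (4, 0), "b": (1, 0), "c": (1, 1)})
print(first)   # ['a', 'c']
print(second)  # ['c', 'a']
```

The point of the sketch is that the second call reuses the distances saved by the first instead of recomputing them from the full trajectories, which is the general idea a state-reuse mechanism exploits.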

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61802273 and 62102277), the Postdoctoral Science Foundation of China (No. 2020M681529), the Natural Science Foundation of Jiangsu Province (No. BK20210703), the China Science and Technology Plan Project of Suzhou (No. SYG202139), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Nos. SJCX2_11342 and KYCX22_3197).

Author information

Corresponding author

Correspondence to Junhua Fang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pan, Z., Chao, P., Fang, J. et al. Garden: a real-time processing framework for continuous top-k trajectory similarity search. Knowl Inf Syst 65, 3777–3805 (2023). https://doi.org/10.1007/s10115-023-01880-z

  • DOI: https://doi.org/10.1007/s10115-023-01880-z
