A QUANTITATIVE ANALYSIS OF THE USE OF MICRODATA FOR SEMANTIC ANNOTATIONS ON EDUCATIONAL RESOURCES
Keywords:
semantic web, Microdata, educational resources, Schema.org, LRMI, web standards
Abstract
A current trend in the semantic web is the use of embedded markup formats intended to semantically enrich web content by making it more understandable to search engines and other applications. The deployment of Microdata as a markup format has increased thanks to the widespread adoption of the controlled vocabulary provided by Schema.org. Recently, Schema.org adopted a set of properties from the Learning Resource Metadata Initiative (LRMI) specification, which describes educational resources. These properties, together with the Schema.org properties related to the accessibility and license of resources, would enable search engines to return more relevant results when searching for educational resources for all users, including users with disabilities. To obtain a reliable evaluation of the use of Microdata properties related to the LRMI specification, accessibility, and the license of resources, this research conducted a quantitative analysis of the deployment of these properties in large-scale web corpora covering two consecutive years. The corpora contain hundreds of millions of web pages. The results further our understanding of this deployment and highlight the pending issues and challenges concerning the use of such properties.