Module 1 – Unit 1

Browsing, searching and filtering data, information and digital content

1.1.1. — Introduction

DESCRIPTION OF THE TOPIC

The first Unit provides activities and tools that help the reader improve the digital skills of browsing, searching and filtering data, information and digital content. The focus is on the use of open data and open datasets, in order to meet teachers’ current needs.

The overwhelming amount of data, information and digital content on the World Wide Web, and on the Internet in general, gives everyone the opportunity to become the “owner” of various pieces of knowledge at almost no cost. But is this true? Can we handle all this information? Are we capable of getting the right kind and amount of data for the topic that interests us? Or, even better, are we capable of asking the right question in order to get the proper data and then turn it into information?

This challenge has highlighted, now more than ever, the importance of identifying one’s personal information needs in this “chaos” of digital content, along with the need to improve our skills in browsing, searching and filtering it. The use of open data by many organizations has created many new possibilities, not only for creating new knowledge but also for facilitating the education process. On the other hand, accessing open data and related datasets is not as easy as it sounds, so teachers at all school levels often need additional skills.

The activities in this unit highlight the importance of identifying one’s personal information needs. They also propose a methodology and related tools for building a personal search strategy, so that the teacher can deal more effectively with the enormous amount of digital content of our era. Online security is also taken into consideration, and a related activity is proposed. Further activities introduce the reader to the area of open data, so that he/she becomes more familiar with searching and accessing it to cover his/her information needs.

LEARNING OUTCOMES

By the end of this unit, users will be able to:

  • identify their educational needs in the field of information and data literacy
  • create and update personal safe search strategies
  • facilitate the use of open data in the educational process

DIGCOMP FRAMEWORK

Competence area 1 (Information and Data Literacy):

1.1 Browsing, searching and filtering data, information and digital content

DIGCOMPEDU FRAMEWORK

Competence area 6 (Facilitating Learner’s digital competence):

6.1 Information and media literacy

REFERENCES

Punie, Y., editor(s), Redecker, C., European Framework for the Digital Competence of Educators: DigCompEdu, EUR 28775 EN, Publications Office of the European Union, Luxembourg, 2017, ISBN 978-92-79-73718-3 (print), 978-92-79-73494-6 (pdf), doi:10.2760/178382 (print), 10.2760/159770 (online), JRC107466 (https://publications.jrc.ec.europa.eu/repository/handle/JRC107466).

Vuorikari R, Punie Y, Carretero Gomez S and Van Den Brande G. DigComp 2.0: The Digital Competence Framework for Citizens. Update Phase 1: the Conceptual Reference Model. EUR 27948 EN. Luxembourg (Luxembourg): Publications Office of the European Union; 2016. JRC101254 (https://publications.jrc.ec.europa.eu/repository/handle/JRC101254).

1.1.2 — Explorer Level – Activities

1.1.2.2. — Open Data Portals and Platforms

DESCRIPTION OF THE ACTIVITY

A platform is a major piece of software on which smaller pieces of software and content can be run. For open data, the largest platform is the Web. However, lots of other purpose-built software can help you publish open data, or provide interactive tools to help you explore it.

What is an open data platform?

Open data platforms are pieces of software that make it simpler to publish and manage open data on the Web. For publishers, an open data platform provides a pathway to publish data. Platforms guide publishers through the process of publishing data, and offer users consistency and ease of access to open data from around the world.

Key features of open data platforms

Open data platforms make open data

  • Discoverable
    Open data platforms

    • promote open data to users,
    • allow users to quickly find and reuse relevant open data,
    • are search-optimised to make it easier to find relevant resources and
    • provide data-feeds that can be discovered and indexed across the web
  • Usable in a consistent way
    Open data platforms

    • don’t require users to learn many different technologies to discover large amounts of open data,
    • are designed with the user in mind, with a layout and design that make it easier to discover data, and
    • often follow a standard naming convention for the Web address: many governments use data.gov.<country top-level domain> (e.g. http://data.gov.gr ).

Approaches to open data platform design

Platforms provide different approaches to publishing and exposing open data. Discover the different approaches below.

Data catalogue

An open data catalogue is a platform that

  • lists datasets on the Web
  • resembles a directory
  • knows
    • what open data exists,
    • what open data is about,
    • where open data is and
    • how to get hold of open data.
  • links users to data that is located somewhere else on the Web
  • offers a consistent way of locating a diverse set of data that is widely scattered.
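For readers comfortable with a little programming, the idea of a catalogue record can be sketched in Python. This is only an illustration of the four things a catalogue "knows" about each dataset, not how any real portal is implemented. The class name, field names, sample records and example.org addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogueRecord:
    """One catalogue entry: metadata about a dataset stored elsewhere on the Web."""
    title: str         # what open data exists
    description: str   # what the data is about
    url: str           # where the data is
    file_format: str   # how to get hold of it (here: the download format)


# Hypothetical records for illustration only; the URLs are placeholders.
CATALOGUE = [
    CatalogueRecord("Air quality measurements", "Hourly air pollution readings",
                    "https://example.org/air.csv", "CSV"),
    CatalogueRecord("Honey production", "Annual honey production by region",
                    "https://example.org/honey.json", "JSON"),
    CatalogueRecord("School locations", "Addresses of public schools",
                    "https://example.org/schools.csv", "CSV"),
]


def find(records, keyword):
    """Return records whose title or description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in records
            if needle in r.title.lower() or needle in r.description.lower()]


matches = find(CATALOGUE, "honey")
for record in matches:
    # The catalogue only points to the data; the file itself lives at record.url.
    print(record.title, "->", record.url)
```

Note that the catalogue stores only descriptions and links, which is why one catalogue can cover data scattered across many different sites.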

Data management

Open data management platforms from the publisher’s point of view

  • can play an active role in the way a publisher
    • manages a dataset
    • maintains a dataset
  • can allow publishers to
    • update data directly in the platform and
    • regularly provide updates.

From the user’s point of view, being able to

  • search for open data and
  • manipulate open data

without having to download it can make reuse a lot easier.

Open data portals

Open data portals are web-based interfaces designed to make it easier to find re-usable information. Like library catalogues, they contain metadata records of datasets published for re-use, i.e. mostly relating to information in the form of raw, numerical data and not to textual documents. In combination with specific search functionalities, they facilitate finding datasets of interest.

European Data Portal

The European Data Portal constitutes the main portal for open data across European countries. It harvests the metadata of Public Sector Information available on public data portals across European countries. Information regarding the provision of data and the benefits of re-using data is also included.

About the European Data Portal

Going beyond the harvesting of metadata, the strategic objective of the European Data Portal is to improve accessibility and increase the value of Open Data. For more information visit https://www.europeandataportal.eu/en/about/european-data-portal

Provides:

  • data catalogue search capability
  • dataset search capability
  • eLearning section
  • Impact & studies on various aspects
  • News & Events

Activity

Explore the European Data Portal and visit the Data section, where you can search for datasets by applying various filters such as file format, data scope, metadata quality and data license.

1st example: Data scope = Europe, Categories = Environment, Metadata Quality = Excellent

How many datasets does it return?

Change –> Metadata Quality = Good+

How many datasets does it return now?

Check the number of datasets in different formats.

2nd example: try to search with the Keyword = “honey”

How many datasets does it return?

choose category = “Agriculture, fisheries, forestry and food”

How many datasets does it return?

Experiment with categories, topics and keywords of your interest to get familiar with this important resource for datasets.
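If you are curious why the number of results changes when you relax a filter, the following Python sketch imitates the behaviour on a handful of made-up records. It does not call the portal. The quality scale, field names and records are assumptions chosen for illustration, and "Good+" is read here as "Good or better".

```python
from collections import Counter

# Assumed quality scale, ordered from worst to best (an assumption for this sketch).
QUALITY_LEVELS = ["Bad", "Sufficient", "Good", "Excellent"]

# Made-up metadata records; these are not real portal contents.
DATASETS = [
    {"title": "River water quality", "scope": "Europe", "category": "Environment",
     "quality": "Excellent", "format": "CSV"},
    {"title": "Forest cover", "scope": "Europe", "category": "Environment",
     "quality": "Good", "format": "JSON"},
    {"title": "Noise map", "scope": "Europe", "category": "Environment",
     "quality": "Sufficient", "format": "CSV"},
    {"title": "Honey prices", "scope": "Europe",
     "category": "Agriculture, fisheries, forestry and food",
     "quality": "Excellent", "format": "XLSX"},
]


def search(datasets, scope=None, category=None, min_quality=None):
    """Return the datasets that pass every filter that was given."""
    min_rank = QUALITY_LEVELS.index(min_quality) if min_quality else 0
    return [
        d for d in datasets
        if (scope is None or d["scope"] == scope)
        and (category is None or d["category"] == category)
        and QUALITY_LEVELS.index(d["quality"]) >= min_rank
    ]


excellent_only = search(DATASETS, scope="Europe", category="Environment",
                        min_quality="Excellent")
good_or_better = search(DATASETS, scope="Europe", category="Environment",
                        min_quality="Good")

# Count the remaining datasets per file format, as in the activity.
format_counts = Counter(d["format"] for d in good_or_better)

print(len(excellent_only), len(good_or_better), dict(format_counts))
```

Each additional or stricter filter can only keep or reduce the number of results, while relaxing a filter can only keep or increase it.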

Activity

The OpenDataMonitor project is a European index of open data sites and provides a useful reference guide (https://www.opendatamonitor.eu/frontend/web/index.php).

  • Explore the index (and/or the map) to discover, for different countries, the percentage of
    • open licensed data
    • machine readable data
    • available data
    • metadata completeness

  • Explore the index (and/or the map) to discover the distribution size of the data.
  • Visit the advanced search capability and filter by your country
    • Check how many data catalogues there are.

(Notice you can select to search either for Data Catalogues or individual Datasets.)

  • Click on the Benchmark link
    • Select two or more countries to compare
    • Select a time period, if you like
    • Examine the resulting comparative graphs for different factors (open licenses, availability and more)

Other Open Data Portals, Dataset Search Engines and Open Resources

References

TOOLS DATA & RESOURCES NEEDED

  • Web Browser (Chrome, Firefox, Edge, Opera, etc.)

TIME REQUIRED

  • 15 minutes: study the material
  • 5 minutes: first activity
  • 10 minutes: second activity

1.1.2.3. — Online Safety Rules

DESCRIPTION OF THE ACTIVITY

Using the Internet and browsing the World Wide Web can put our privacy and online safety at risk.

How informed are you about safety rules that can protect you in your daily online digital tasks?

Search the web for

  • “Online safety rules” or
  • “Online safety tips” or
  • a similar search query

Pick at least 3 of the top results from the search engine that contain some kind of list of safety rules and tips:

  • Compare the safety tips/rules and check how many of the suggested guidelines are the same
  • Gather all tips in one list (only their titles, not the full text) and remove duplicates
  • Create a 3-column table with the column names “I know and I follow it”, “I know but I do not follow it” and “I didn’t know it”, and place every safety tip in the appropriate column
  • Study the two safety rules that you are not following in your daily tasks and that seem simplest to you, and try to adopt them
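The list-merging and table-building steps above can be done on paper or in a document editor. For readers who like to automate such tasks, here is a minimal Python sketch. The tip titles and the self-assessment values are invented for illustration, and the function name is hypothetical.

```python
COLUMNS = (
    "I know and I follow it",
    "I know but I do not follow it",
    "I didn't know it",
)


def merge_tips(*tip_lists):
    """Combine several tip lists, dropping duplicates that differ only in case or spacing."""
    seen, merged = set(), []
    for tips in tip_lists:
        for tip in tips:
            key = " ".join(tip.lower().split())  # normalise case and whitespace
            if key not in seen:
                seen.add(key)
                merged.append(tip.strip())
    return merged


# Example tip titles from three imaginary web pages (invented for illustration).
site_a = ["Use strong passwords", "Keep software updated", "Think before you click"]
site_b = ["use strong passwords", "Back up your data"]
site_c = ["Keep  software updated", "Use two-factor authentication"]

all_tips = merge_tips(site_a, site_b, site_c)

# Your own self-assessment: the column index (0, 1 or 2) for each tip. Example values.
assessment = {
    "Use strong passwords": 0,
    "Keep software updated": 1,
    "Think before you click": 0,
    "Back up your data": 1,
    "Use two-factor authentication": 2,
}

# Build the 3-column table as a dictionary: column name -> list of tips.
table = {name: [] for name in COLUMNS}
for tip in all_tips:
    table[COLUMNS[assessment[tip]]].append(tip)

for name, tips in table.items():
    print(name, ":", tips)
```

This sketch only catches duplicates that are written identically apart from case and spacing. Tips that say the same thing in different words still need to be merged by hand.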

Some online resources for additional study:

TOOLS DATA & RESOURCES NEEDED

  • Web Browser (Chrome, Firefox, Edge, Opera, etc.)
  • Document editor (Word, LibreOffice Writer, Google Docs) or paper

TIME REQUIRED

  • 5 minutes: web searching
  • 5 minutes: safety rules list creation
  • 10 minutes: table creation
  • 10 minutes: study chosen safety rules

1.1.3 — Expert Level – Activities

1.1.3.1 — Google Paradigm

1.1.3.2. — Access Data in a Dataset

DESCRIPTION OF THE ACTIVITY

Access and navigate datasets through Tableau Public

If you are not familiar with Tableau, then in addition to the steps described below for our example, you can study the tutorial on the official website https://help.tableau.com/current/guides/get-started-tutorial/en-us/get-started-tutorial-home.htm or any other tutorial found on the web. If you want your work to be saved, you should create an account on the Tableau site. You can download the software from the official Tableau site (tableau.com): Products –> Tableau Public.

  1. Download the .csv file from https://www.ecdc.europa.eu/en/publications-data/data-national-14-day-notification-rate-covid-19
  2. Connect –> To a File –> Text file (choose All files (*.*) if you cannot find the file) –> select the .csv dataset you downloaded
  3. If your data is not properly presented in tabular form, make sure that “Comma” is selected as the separator: “Text field Properties… –> Field separator: Comma”

Figure 5: Comma as Text Field Separator

  4. From the same position you can “Rename” your Data Source and Sheet if you like. Let’s change it to “COVID-19 Data”.

Figure 6: Rename Data Source

  5. You can check the type of data per column.

Figure 7: Type of Column Data

  6. You can hide unneeded columns. We will hide country_code and source.

Figure 8: Hide columns

  7. To filter data, follow “Filters –> Add –> Add… –> Select a field:”. Let’s filter by country and select “United Kingdom”. You can of course apply any filtering you like, combine one or more choices and add more than one filter.

To remove or edit a filter, follow “Filters –> Edit –> Select your filter –> Edit… or Remove”.

Figure 9: Add filtering
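The same idea (read a comma-separated file, hide columns you do not need, keep only the rows for one country) can also be expressed in a few lines of Python. This is an optional sketch and not part of the Tableau workflow. The column names follow the ones mentioned in this activity, while the rows and numbers are made-up placeholders, not real ECDC figures. With the real file you would read from the downloaded .csv instead of the sample text.

```python
import csv
import io

# Made-up sample rows using the dataset's column names.
# The numbers are placeholders, NOT real ECDC figures.
SAMPLE_CSV = """country,country_code,continent,indicator,weekly_count,year_week,cumulative_count,source
France,FRA,Europe,cases,100,2021-03,1000,sample
France,FRA,Europe,deaths,10,2021-03,90,sample
United Kingdom,GBR,Europe,cases,200,2021-03,2000,sample
United Kingdom,GBR,Europe,deaths,20,2021-03,150,sample
"""

HIDDEN_COLUMNS = {"country_code", "source"}


def load_rows(text):
    """Parse comma-separated text and drop the hidden columns from every row."""
    reader = csv.DictReader(io.StringIO(text))  # comma is the default separator
    return [{k: v for k, v in row.items() if k not in HIDDEN_COLUMNS}
            for row in reader]


def filter_rows(rows, **conditions):
    """Keep rows where every named column holds one of the allowed values."""
    return [r for r in rows
            if all(r[col] in allowed for col, allowed in conditions.items())]


rows = load_rows(SAMPLE_CSV)
uk_rows = filter_rows(rows, country={"United Kingdom"})

for row in uk_rows:
    print(row)
```

Several filters can be combined in one call, for example `filter_rows(rows, country={"France", "United Kingdom"}, year_week={"2021-03"})`.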

Filtering Activity:

  1. Cases and Deaths in France and UK for the 3rd week of 2021
    • Add a filter by country: France and UK
    • Add another filter by year_week: choose “2021-03”
    • How many rows appear?
    • Which column shows the cases and deaths up until that week?
    • Do you get the same results as in the following figure? (Normally you should, because the historical data in a dataset is not meant to change.)

Figure 10: Filtering data of France and UK for year 2021 and 3rd week

  2. Show deaths in European countries in descending order for the current week if you have the latest dataset, otherwise choose the 3rd week of 2021.
    • You need 3 filters (by continent, by indicator and by year_week)
    • Order by cumulative count

Figure 11: Deaths in European countries from COVID-19 until 3rd week of 2021 in descending order

  8. To save the changes you have made, and to avoid repeating many of the steps needed to format the raw data from the dataset you downloaded, you can export your current data to CSV file format: “Data –> Export Data to CSV –> Choose name and location for your CSV file”.
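The second filtering task (three filters, ordering by cumulative count, then exporting) can be sketched in Python as follows. This is an illustration under stated assumptions: the rows are made-up placeholders rather than real ECDC figures, and the indicator value "deaths" and continent value "Europe" are assumed spellings. The export is written to an in-memory text buffer, and with a real file you would use `open("covid19_data.csv", "w", newline="")` (a hypothetical file name) instead.

```python
import csv
import io

# Made-up sample rows; the numbers are placeholders, NOT real ECDC figures.
SAMPLE_CSV = """country,continent,indicator,year_week,weekly_count,cumulative_count
France,Europe,deaths,2021-03,10,90
Italy,Europe,deaths,2021-03,12,120
Italy,Europe,cases,2021-03,300,3000
Spain,Europe,deaths,2021-02,8,70
Japan,Asia,deaths,2021-03,5,40
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Three filters: by continent, by indicator and by year_week.
selected = [
    r for r in rows
    if r["continent"] == "Europe"
    and r["indicator"] == "deaths"
    and r["year_week"] == "2021-03"
]

# Convert to int before sorting: text is compared character by character,
# so as text "90" would rank above "120".
selected.sort(key=lambda r: int(r["cumulative_count"]), reverse=True)

# Export the filtered, sorted rows as CSV text.
output = io.StringIO(newline="")
writer = csv.DictWriter(output, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(selected)
exported = output.getvalue()

print(exported)
```

The conversion to a number before sorting corresponds to checking the column data type in Tableau: a count stored as text does not sort in numeric order.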

Access and navigate datasets through Microsoft Excel

On the Web you can find a lot of tutorials on Microsoft Excel with a simple search on a search engine, YouTube or Vimeo.

  1. Download the .csv file from https://www.ecdc.europa.eu/en/publications-data/data-national-14-day-notification-rate-covid-19
  2. Open the file from Excel, choosing All files (*.*) if you cannot find it. If Excel reports that the file format is not appropriate or that the file is corrupted, ignore the warning and click “Yes” to the question “Do you want to open it anyway?”
  3. A wizard will come up to prepare your data. In Step 1, check “Delimited” and “My data has headers” –> Next –> Step 2: “Comma” –> Next –> Step 3: “Advanced”, and choose the right settings so that numeric data is recognised if your country uses different decimal or thousands separators (this mainly concerns column “rate_14_day” in our csv; confirm that it is represented correctly). Otherwise just click “Finish”.
  4. Rename the sheet you are working on, if you like, to “COVID-19 Data”. Right-Click (1) –> Rename (2) (see the figure below)

Figure 12: Rename Excel Sheet

  5. You can hide unneeded columns. We will hide columns “country_code” and “source”. Right-Click on top of column J (1) –> Hide (2). Do the same for the column of “country_code”.

Figure 13: Hide Column J with header name “Source”

  6. Filtering data in Excel. Let’s filter the data that concerns the United Kingdom. First, one way to ensure that we don’t lose any data while filtering is to click on the upper-left corner of the Excel sheet so that all data is selected (see Figure 14).

Figure 14: Select all data

Then, click “Sort & Filter” –> Filter.

Figure 15: Excel Filtering

Click on “Country” –> Select All (so that all countries are deselected) –> Scroll down and click “United Kingdom” –> OK

Figure 16: Filtering specific country

The result should be similar to the following figure. Specifically, the data in columns F and I (weekly_count and cumulative_count respectively) should be exactly the same, because they are historical data not meant to be altered, unless they were not properly calculated in the first place.

Figure 17: Result of filtering
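If you download the dataset again at a later date, you can check whether historical rows really stayed the same. The Python sketch below compares two versions of the data and lists the rows whose counts were revised. It is only an illustration: both versions here are made-up placeholder rows, not real ECDC figures, and the choice of country, indicator and year_week as the row key is an assumption.

```python
import csv
import io

# Columns assumed to identify one row uniquely (an assumption for this sketch).
KEY = ("country", "indicator", "year_week")

# Two made-up versions of the same dataset; NOT real ECDC figures.
OLD_VERSION = """country,indicator,year_week,weekly_count,cumulative_count
France,cases,2021-02,90,900
France,cases,2021-03,100,1000
"""

NEW_VERSION = """country,indicator,year_week,weekly_count,cumulative_count
France,cases,2021-02,90,900
France,cases,2021-03,105,1005
France,cases,2021-04,110,1115
"""


def index_rows(text):
    """Map each row's key (country, indicator, year_week) to the full row."""
    return {tuple(row[k] for k in KEY): row
            for row in csv.DictReader(io.StringIO(text))}


def changed_rows(old_text, new_text, columns=("weekly_count", "cumulative_count")):
    """Return the keys present in both versions whose counts differ."""
    old, new = index_rows(old_text), index_rows(new_text)
    return sorted(
        key for key in old.keys() & new.keys()
        if any(old[key][c] != new[key][c] for c in columns)
    )


revised = changed_rows(OLD_VERSION, NEW_VERSION)
print(revised)
```

Rows that exist only in the newer version (new weeks) are ignored, so the result lists only corrections to data that had already been published.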

Filtering Exercise:

  1. Cases and Deaths in France and UK for the 3rd week of 2021
    • Add a filter by country: France and UK
    • Add another filter by year_week: choose “2021-03”
    • How many rows appear?
    • Which column shows the cases and deaths up until that week?

Do you get the same results as in the following figure? (Normally you should, because the historical data in a dataset is not meant to change.)

Figure 18: Filtering data of France and UK for year 2021 and 3rd week

  2. Show deaths in European countries in descending order for the current week if you have the latest dataset, otherwise choose the 3rd week of 2021.
    • You need 3 filters (by continent, by indicator and by year_week)
    • Select all data (click upper-left corner) –> “Sort & Filter” –> Custom Sort… –> Sort by “cumulative_count” and Order “Largest to Smallest” –> OK

Figure 19: Deaths in European countries from COVID-19 until 3rd week of 2021 in descending order

  7. To save the changes you have made, and to avoid repeating many of the steps needed to format the raw data from the dataset you downloaded, you should save your data as an “.xlsx” file.
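The number-format settings in step 3 of the import wizard matter because the same text can be read differently depending on which character is treated as the decimal separator. The Python sketch below illustrates the idea. The function name and sample values are hypothetical, and the sketch assumes you already know which separator the text uses.

```python
def parse_number(text, decimal_separator="."):
    """Convert text such as "12.5" or "12,5" to a float.

    The caller states which character is the decimal separator; the other
    one (comma or point) is treated as a thousands separator and removed.
    Empty cells return None.
    """
    cleaned = text.strip()
    if not cleaned:
        return None
    if decimal_separator == ",":
        # Remove points used for thousands, then turn the decimal comma into a point.
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        # Remove commas used for thousands; the point is already the decimal mark.
        cleaned = cleaned.replace(",", "")
    return float(cleaned)


print(parse_number("12.5"))            # decimal point
print(parse_number("12,5", ","))       # decimal comma
print(parse_number("1.234,5", ","))    # point for thousands, comma for decimals
```

A value such as "1,234" is ambiguous on its own, which is why both Excel and this sketch need to be told which convention applies.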

Reflection

Dealing with open data and open datasets is not easy. Skills in using spreadsheet software are useful for accessing and manipulating open datasets. The above activity is a small example of how you can do that for a real-life problem. It could include more steps and be more exhaustive, but that is beyond the scope of this course. Its main purpose is for readers to start experimenting with open data, stop feeling “absolute fear” of it and, with some practice, manage to use some simple datasets for their educational purposes.

TOOLS DATA & RESOURCES NEEDED

  • Web Browser
  • Tableau Public
  • Microsoft Excel

TIME REQUIRED

  • 20 minutes for each tool (Tableau, Excel), depending on your experience with the software.