New Skills for the Next Generation of Journalists

2017-1-HU01-KA203-036038

Databases

According to the Encyclopaedia Britannica, a database, also called an electronic database, is any collection of data or information that is specifically organised for rapid search and retrieval by a computer. Databases are structured to facilitate the storage, retrieval, modification, and deletion of data in conjunction with various data-processing operations.

Databases can be classified according to their type of content: bibliographic, full text, numeric and image. They can also be classified by how they organize and host data. Common kinds include relational databases (items are organized as a set of tables with columns and rows), object-oriented databases (which allow for the integration of the data and the programming language in the database), distributed databases (two or more files located on different sites) and cloud databases (built for a virtualized environment).
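To make the relational model concrete, the following minimal sketch uses SQLite through Python's standard sqlite3 module. It shows one table with columns and rows and the four basic operations mentioned earlier: storage, retrieval, modification and deletion. The table name, column names and sample records are invented for illustration only.

```python
import sqlite3


def run_demo():
    """Create a small table, then store, modify, delete and retrieve rows."""
    # An in-memory database: nothing is written to disk.
    conn = sqlite3.connect(":memory:")

    # A relational table is defined by its columns; each record is a row.
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, author TEXT NOT NULL)"
    )

    # Storage: insert three rows (hypothetical sample data).
    conn.executemany(
        "INSERT INTO articles (title, author) VALUES (?, ?)",
        [
            ("Election results", "A. Kovacs"),
            ("Budget analysis", "B. Silva"),
            ("Draft note", "C. Meyer"),
        ],
    )

    # Modification: change the title of one row.
    conn.execute(
        "UPDATE articles SET title = ? WHERE author = ?",
        ("Budget analysis 2018", "B. Silva"),
    )

    # Deletion: remove one row.
    conn.execute("DELETE FROM articles WHERE title = ?", ("Draft note",))

    # Retrieval: read the remaining rows back in insertion order.
    rows = conn.execute(
        "SELECT title, author FROM articles ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for title, author in run_demo():
        print(title, "by", author)
```

The question marks are placeholders that keep the data separate from the SQL statement, which is the usual way to pass values into a query safely.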

Databases have evolved dramatically since the early 1960s. In the 1980s, relational databases became popular, followed by object-oriented databases shortly after. Object-oriented databases represent data in the form of objects and classes. In this model, an object consists of properties (state) and methods (behavior). For example, a student can be represented in their university’s database with properties such as name, address or date of birth, as well as with behaviors such as attending classes, writing exams or paying fees. In other words, the data and the operations on it are bundled together in one object, so all related information is directly available. Objects are grouped into classes, which form a hierarchy of classes and subclasses. The objects are interlinked and can be easily retrieved after they have been saved. Object-oriented database models have advantages: complex data sets can be saved and accessed quickly and easily, and the models work well with object-oriented programming languages like C++, JavaScript or Python. However, object databases are not widely adopted, and their high complexity can cause performance problems.
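The student example can be sketched in Python as follows. This is only an illustration of how state and behavior are combined in one object and how classes form a hierarchy. It is not an object database: nothing is saved, and the class names, field names and values are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    """Parent class: properties shared by everyone at the university."""
    name: str
    address: str
    date_of_birth: date


@dataclass
class Student(Person):
    """Subclass: inherits the Person properties and adds its own."""
    fees_due: int = 0
    exams_written: list = field(default_factory=list)

    def write_exam(self, course):
        """Behavior: record that the student wrote an exam."""
        self.exams_written.append(course)

    def pay_fees(self, amount):
        """Behavior: reduce the outstanding fees, never below zero."""
        self.fees_due = max(0, self.fees_due - amount)
        return self.fees_due


if __name__ == "__main__":
    student = Student("Anna Example", "1 Main Street", date(1999, 5, 1), fees_due=500)
    student.write_exam("Statistics")
    student.pay_fees(200)
    print(student)
```

An object-oriented database would store such objects directly, together with their class hierarchy, instead of splitting them into rows of separate tables.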

Recently, NoSQL databases (non-relational databases that allow unstructured data to be stored and manipulated) have become popular. The most recent developments are cloud databases and self-driving databases, a term that describes automated databases that perform operations such as security, backup and update management without human intervention.
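The idea of storing data without a fixed table structure can be illustrated with a toy document store. The sketch below is a simplified, in-memory stand-in written for this text, not a real NoSQL product; the class name TinyDocumentStore and the sample documents are hypothetical. It shows that records of different shapes can sit side by side and still be searched.

```python
import json


class TinyDocumentStore:
    """Toy in-memory document store (illustration only, not a real NoSQL engine)."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        """Store a document and return its generated identifier."""
        doc_id = self._next_id
        # Round-trip through JSON so the stored copy is independent
        # of the caller's dictionary and contains only JSON-safe values.
        self._docs[doc_id] = json.loads(json.dumps(doc))
        self._next_id += 1
        return doc_id

    def find(self, **criteria):
        """Return all documents whose fields match every given criterion."""
        return [
            doc
            for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


if __name__ == "__main__":
    store = TinyDocumentStore()
    # The two documents have different fields; no schema is declared.
    store.insert({"type": "article", "title": "Election results", "tags": ["politics"]})
    store.insert({"type": "tweet", "text": "Polls close at 8 pm"})
    print(store.find(type="tweet"))
```

In a relational table, both records would need the same columns; here each document simply carries whatever fields it has.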