
Chamezopoulos Savvas

MSc Electrical and Computer Engineering

MSc Data Science

Download Resume (EN) Download Resume (GR)

About Me

Engineering and computer science have always been my main interests. I have always found myself tinkering with a tool in my hand or happily spending hours on a line of code. Ten years of sailing have taught me to adapt to constant change and to give my best shot at every chance I take. Sailing has also improved my repair and construction skills and helped me develop fast decision-making. In addition, working in numerous clubs and cafes as a waiter and a bartender/barista has helped me develop skills such as interacting easily with different types of people, keeping my integrity under stress, selling products, and managing a small team.

On this personal page I outline the most important milestones of my career and present my education and main hard skills. Although I worked throughout most of my university years, I present only the job experiences that shaped me the most, outlining some of the soft skills each job helped me develop.

Education

University of Amsterdam

Sep 2021 - June 2022

Master of Science in Information Studies: Data Science

Aristotle University of Thessaloniki

Nov 2014 - July 2020

Bachelor of Engineering with integrated Master of Engineering in Electrical and Computer Engineering

Languages

Certificate of Proficiency in English

Issued by University of Michigan

Diplôme d'études en langue française DELF B2

Issued by La Commission Nationale du DELF et du DALF

Greek

Mother Tongue

Publications

Overview of the DagPap24 Shared Task on Detecting Automatically Generated Scientific Papers

Association for Computational Linguistics · Aug 16, 2024

Abstract: This paper provides an overview of the 2024 ACL Scholarly Document Processing workshop shared task on the detection of automatically generated scientific papers. Unlike our previous task, which focused on the binary classification of whether scientific passages were machine-generated or not, one likely use case for text generation technology in scientific writing is to intersperse human-written text with passages of machine-generated text. We frame the detection problem as a multiclass span classification task: given an excerpt of text, label token spans in the text as human-written or machine-generated. We shared a dataset containing excerpts from human-written papers as well as artificially generated content collected by Elsevier publishing and editorial teams. As a test set, the participants were provided with a corpus of openly accessible human-written as well as generated papers from the same scientific domains of documents. The shared task saw 457 submissions across 28 participating teams and resulted in three published technical reports. We discuss our findings from the shared task in this overview paper.

View Paper

Evaluating approaches to identifying research supporting the United Nations Sustainable Development Goals

Quantitative Science Studies · May 15, 2024

Abstract: The United Nations (UN) Sustainable Development Goals (SDGs) challenge the global community to build a world where no one is left behind. Recognizing that research plays a fundamental part in supporting these goals, attempts have been made to classify research publications according to their relevance in supporting each of the UN’s SDGs. In this paper, we outline the methodology that we followed when mapping research articles to SDGs and which is adopted by Times Higher Education in its Social Impact rankings. We compare our solution with other existing queries and models mapping research papers to SDGs. We also discuss various aspects in which the methodology can be improved and generalized to other types of content apart from research articles. The results presented in this paper are the outcome of the SDG Research Mapping Initiative, which was established as a partnership between the University of Southern Denmark, the Aurora European Universities Alliance (represented by Vrije Universiteit Amsterdam), the University of Auckland, and Elsevier to bring together broad expertise and share best practices on identifying research contributions to UN’s Sustainable Development Goals.

View Paper

Article Classification with Graph Neural Networks and Multigraphs

ELRA and ICCL · May 30, 2023

Abstract: Classifying research output into context-specific label taxonomies is a challenging and relevant downstream task, given the volume of existing and newly published articles. We propose a method to enhance the performance of article classification by enriching simple Graph Neural Networks (GNN) pipelines with edge-heterogeneous graph representations. SciBERT is used for node feature generation to capture higher-order semantics within the articles' textual metadata. Fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark (OGB) ogbn-arxiv dataset and the PubMed diabetes dataset, augmented with additional metadata from Microsoft Academic Graph (MAG) and PubMed Central, respectively. The results demonstrate that edge-heterogeneous graphs consistently improve the performance of all GNN models compared to the edge-homogeneous graphs. The transformed data enable simple and shallow GNN pipelines to achieve results on par with more complex architectures. On ogbn-arxiv, we achieve a top-15 result in the OGB competition with a 2-layer GCN (accuracy 74.61%), being the highest-scoring solution with sub-1 million parameters. On PubMed, we closely trail SOTA GNN architectures using a 2-layer GraphSAGE by including additional co-authorship edges in the graph (accuracy 89.88%).

View Paper

The CLIN33 Shared Task on the Detection of Text Generated by Large Language Models

Computational Linguistics in the Netherlands Journal · Mar 21, 2023

Abstract: The Shared Task for CLIN33 focuses on a relatively novel yet societally relevant task: the detection of text generated by Large Language Models (LLMs). We frame this detection task as a binary classification problem (LLM-generated or not), using test data from up to 6 different domains and text genres for both Dutch and English. Part of this test data was held out entirely from the contestants, including a “mystery genre” that belonged to an unknown domain (later revealed to be columns). Four teams submitted 11 runs with substantially different models and features. This paper gives an overview of our task setup and contains the evaluation and detailed descriptions of the participating systems. Notably, included in the winning systems are both deep learning models as well as more traditional machine learning models leveraging task-specific feature engineering.

View Paper

Link Prediction in Signed Social Networks: The Case of Bitcoin Users

XIV Balkan Conference on Operational Research · Dec 3, 2020

Abstract: During the last decade, social networks have appeared in many aspects of modern life. By their nature, these networks are dynamic objects and, thus, questions have emerged regarding their evolution over time. The availability of large datasets encoding network information, along with novel machine learning algorithms and solutions, has made possible the extensive study of social network properties and structural features. In our work, we study the well-known link prediction problem, which seeks to accurately predict possible future links in the network or links missing due to incomplete data. The most common modelling approach is to represent these networks as graphs, where the nodes represent entities while the edges/links represent the association between entities. We focus on weighted signed social networks and try to predict new edges in a real-world dataset. Specifically, a Bitcoin network is employed in which different users rate the level of trust (on a scale ranging from -10 to 10, excluding 0) they have in other users. Three different frameworks for representation learning on large graphs have been used, namely Node2Vec, CTDNE and GraphSAGE. Following the standard steps involved in supervised learning, the performance of the selected learning functions has been measured using well-known metrics (e.g. accuracy, precision, AUC score) for each implementation strategy employed in our analysis. All three techniques are compared on the aforementioned Bitcoin-related network, and the results provide distinct useful insights into the network’s future formation. Additionally, the same methodologies are applied to a well-known citation network of scientific publications (the CORA dataset) in order to further validate the conclusions of the preceding analysis. Finally, we discuss how the different methodologies regarding network embeddings and link prediction frameworks can be combined effectively to achieve better results on the link prediction problem.

View Paper

Experience

Elsevier B.V.

Data Scientist III

Data Scientist as part of Research Data Science. Worked on numerous projects on research integrity, NLP, fraudulent content detection, and submission quality assessment.

Elsevier B.V.

Data Scientist II

Data Scientist as part of Research Data Science. Worked on numerous projects on research integrity, NLP, fraudulent content detection, and submission quality assessment.

Elsevier B.V.

Intern

This is a thesis internship. In order to complete my thesis for my MSc at UvA, I work for Elsevier B.V., one of the largest scientific publishers in the world. The project, which is "Expanding article classification with graph node classification algorithms", is currently being carried out as part of the RCO projects.

NATO

IT Analyst

During my military service, I had the chance to serve at the NATO Rapid Deployable Corps in Greece (NRDC-GR). Serving in such a multicultural and challenging environment really helped me expand my skills, both soft and hands-on. Being a member of the Information Systems Company of the Greek Army gave me a sense of belonging to a team, assisting both my fellow privates and my superior officers. During the second half of my service, I was promoted to lance corporal, so I had the opportunity to lead a small team of 15 privates on day-to-day tasks. The whole experience was very rewarding: although there were a few difficult times, I learned a lot about team building, teamwork, leading and being led.

NAVAGOS Beach Bar

Bartender

Working at Navagos helped me improve both my bartending and barista skills and pushed me to perfect my communication skills. The most important factor was that the job had to be done during the pandemic, and so under strict health-protection measures.

NATO

IMSS Assistant

Being chosen to participate in NATO's summer student job program as an assistant at the Information Management Support Services (IMSS) of the International Military Staff (IMS) at NATO Headquarters in Brussels was a huge learning experience for me. Firstly, I got hands-on experience of how an organisation of this magnitude works and a view of the complexity of its structure. Secondly, I had the opportunity to meet all kinds of people, to listen to their stories, and to learn that there is more than one path to success, no matter one's background.

Dasaki coffee and more

Head of Staff/Bartender

Working in a local coffee shop is one of the most common choices for university students. I worked there during the summers, the only period in which I could do so, as my engineering studies took up most of my time. Nevertheless, I had the opportunity to start at the bottom of the hierarchy, as a waiter, and by the end of my time there I was both the Head of Staff and a bartender. It was a great experience: managing a small number of staff while bartending helped me improve my multitasking skills and learn to perform under stress.

Projects

Link Prediction

Link prediction in graph data. This is the project I developed for my undergraduate ECE thesis. The main idea of the thesis is to implement and compare several existing link prediction techniques, and also to combine them in a more efficient way. Three different techniques were chosen:

  • Node2Vec
  • CTDNE
  • GraphSAGE

    View Project

    Drive and Bike SQL DB

    This is a project developed for the ECE AUTh course - Database Systems. The developer team (Chamezopoulos Savvas, Mytilis Konstantinos, Ntokos Konstantinos) designed and built a small database simulating a small automotive dealership whose main products are cars and motorcycles, along with parts and repair services for these products. The DB was built on a local PC on the MariaDB server (v. 5.5.57) through MySQL (v. 5.7.9).

    View Project

    Portfolio Web Page

    This is the actual code developed for the website you are currently looking at. It is a simple webpage, hosted on GitHub Pages. The main goal of this project is to gather and present my main projects uploaded on GitHub, along with my CV and some basic contact info. At the same time, I had the chance to learn a few things about HTML, CSS and JavaScript.

    View Project

    Assembly - DC Motor Control Open Loop System

    This is a mini project developed for the ECE AUTh course - Microprocessors and Peripherals. The developer team (Chamezopoulos Savvas, Mytilis Konstantinos) designed a system that controls the speed of a given DC motor while measuring the speed along with the DC input of the motor in real time. The microcontroller used was the Atmel ATmega16 on the STK500 test platform. Two implementations were created, an automatic and a manual one.

    View Project

    AAC-encoder-decoder-in-MATLAB

    This is a project developed for the ECE AUTh course - Multimedia Systems and Virtual Reality. It is a signal encoder-decoder following a simplified AAC protocol. It uses the psychoacoustic model as the conformity criterion for the quantizer, the Modified Discrete Cosine Transform (MDCT) at the filterbank level, and Huffman coding for entropy coding.

    View Project

    Server-Client Messenger

    This is a simple project developed for the ECE AUTh course - Real Time Embedded Systems in C. It is a simple messaging service where the server runs indefinitely (in this case on a flashed zsun device) and multiple clients may connect to leave and/or receive messages for/from other clients. Communication uses the TCP/IP protocol.

    View Project

    Parallel PageRank in C

    This is a mini project developed for the ECE AUTh course - Parallel and Distributed Systems in C. It is a simple parallel implementation of the PageRank algorithm, initially developed by Google, using the Gauss-Seidel method as the solver. The parallelization is done with OpenMP, and a serial implementation of the algorithm is also included.
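For readers unfamiliar with the approach, here is a minimal serial sketch of PageRank solved with Gauss-Seidel sweeps, written in Python for brevity rather than the project's C and OpenMP. It is not the project's code: the function name, the damping factor of 0.85, the tolerance, and the example graph are all illustrative assumptions, and the sketch assumes every page has at least one outgoing link and no page links to itself.

```python
def pagerank_gauss_seidel(out_links, damping=0.85, tol=1e-10, max_sweeps=1000):
    """Serial PageRank sketch; out_links[j] lists the pages that page j links to.

    Assumption: every page has at least one outgoing link and no self-links.
    """
    n = len(out_links)
    if any(len(targets) == 0 for targets in out_links):
        raise ValueError("this sketch assumes every page has an outgoing link")

    # For each page, collect the pages that link to it.
    in_links = [[] for _ in range(n)]
    for j, targets in enumerate(out_links):
        for i in targets:
            in_links[i].append(j)
    out_degree = [len(targets) for targets in out_links]

    rank = [1.0 / n] * n
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(n):
            # Gauss-Seidel: rank[j] for pages already visited in this sweep
            # holds the updated value, unlike a Jacobi/power iteration.
            incoming = sum(rank[j] / out_degree[j] for j in in_links[i])
            updated = (1.0 - damping) / n + damping * incoming
            change += abs(updated - rank[i])
            rank[i] = updated
        if change < tol:
            break

    total = sum(rank)
    return [r / total for r in rank]


# Hypothetical 3-page graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
ranks = pagerank_gauss_seidel([[1, 2], [2], [0]])
print(ranks)
```

Because each update reuses values computed earlier in the same sweep, the inner loop has a data dependency between pages, which is what makes parallelizing this solver less straightforward than a plain power iteration.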

    View Project

    Mean Shift in CUDA

    This is a mini project developed for the ECE AUTh course - Parallel and Distributed Systems. It is a simple implementation of the mean shift algorithm in CUDA.
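As a rough illustration of the algorithm itself, the sketch below runs mean shift serially in plain Python; it is not the CUDA code from the project. The Gaussian kernel, the `bandwidth` parameter, the stopping tolerance, and the sample points are assumptions made for this example.

```python
import math


def mean_shift(points, bandwidth, tol=1e-6, max_iter=200):
    """Move a copy of each point uphill to a mode of the kernel density estimate."""
    modes = []
    for start in points:
        y = list(start)
        for _ in range(max_iter):
            # Gaussian kernel weight of every data point relative to y.
            weights = [
                math.exp(-sum((a - b) ** 2 for a, b in zip(y, x)) / (2 * bandwidth ** 2))
                for x in points
            ]
            total = sum(weights)
            # The new position is the weighted mean of all data points.
            new_y = [
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(len(y))
            ]
            moved = math.dist(new_y, y)
            y = new_y
            if moved < tol:
                break
        modes.append(y)
    return modes


# Hypothetical data: two well-separated groups of 2D points.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (10.0, 10.0), (10.2, 10.0), (10.0, 10.2)]
modes = mean_shift(data, bandwidth=1.0)
print(modes)
```

Each point's trajectory is independent of the others, so the outer loop maps naturally onto one GPU thread (or block) per point, which is why the algorithm suits CUDA.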

    View Project

    KNN search in C

    This is a mini project developed for the ECE AUTh course - Parallel and Distributed Systems. It includes various implementations of the well-known KNN search algorithm:

  • Serial (Single Process-Single Thread)
  • Parallel (Single Process-Multiple Threads)
  • Serial Distributed (Multiple Processes, Single Thread per Process), Blocking Comms
  • Serial Distributed (Multiple Processes, Single Thread per Process), Non-Blocking Comms
  • Parallel Distributed (Multiple Processes, Multiple Threads per Process), Blocking Comms
  • Parallel Distributed (Multiple Processes, Multiple Threads per Process), Non-Blocking Comms

    The parallelization was done with OpenMP and the distribution with MPI.
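The serial baseline among the variants above can be sketched as a brute-force search, shown here in Python rather than the project's C. The function name, the Euclidean distance metric, and the sample points are assumptions for illustration only.

```python
import heapq
import math


def knn_search(corpus, queries, k):
    """For each query, return the k nearest corpus points as (distance, index) pairs."""
    results = []
    for q in queries:
        # Brute force: measure the distance to every corpus point, keep the k smallest.
        nearest = heapq.nsmallest(
            k, ((math.dist(q, p), i) for i, p in enumerate(corpus))
        )
        results.append(nearest)
    return results


# Hypothetical 2D data
corpus = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
neighbours = knn_search(corpus, [(0.0, 0.1)], k=2)
print(neighbours)
```

Queries are independent of each other, so the outer loop can be split across threads, and the corpus can be split across processes that exchange their blocks, which is the general idea behind the threaded and distributed variants.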

    View Project

    Parallel Bitonic Sort Implementations in C

    This is a mini project developed for the ECE AUTh course - Parallel and Distributed Systems. It includes 5 different implementations of the Bitonic Sort algorithm, where the input size is a power of two (2^n).

    Implementations:

  • Pthreads: Recursive
  • OpenMP: Recursive && Imperative
  • CilkPlus: Recursive && Imperative

    View Project
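To show the difference between the recursive and imperative styles listed above, here is a minimal serial sketch of both in Python. It is not taken from the project, which uses C with Pthreads, OpenMP and CilkPlus; the function names are hypothetical, and both versions assume the input length is a power of two.

```python
def _check_power_of_two(n):
    if n & (n - 1) != 0:
        raise ValueError("bitonic sort needs an input length that is a power of two")


def _bitonic_merge(seq, ascending):
    """Sort a bitonic sequence by compare-exchanging elements half a length apart."""
    n = len(seq)
    if n <= 1:
        return seq
    half = n // 2
    for i in range(half):
        if (seq[i] > seq[i + half]) == ascending:
            seq[i], seq[i + half] = seq[i + half], seq[i]
    return _bitonic_merge(seq[:half], ascending) + _bitonic_merge(seq[half:], ascending)


def bitonic_sort_recursive(values, ascending=True):
    n = len(values)
    _check_power_of_two(n)
    if n <= 1:
        return list(values)
    half = n // 2
    # Sort the halves in opposite directions so their concatenation is bitonic.
    first = bitonic_sort_recursive(values[:half], True)
    second = bitonic_sort_recursive(values[half:], False)
    return _bitonic_merge(first + second, ascending)


def bitonic_sort_imperative(values):
    a = list(values)
    n = len(a)
    _check_power_of_two(n)
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:        # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


data = [7, 3, 9, 1, 8, 2, 6, 4]
print(bitonic_sort_recursive(data))
print(bitonic_sort_imperative(data))
```

In the imperative form, all compare-exchange operations inside the innermost loop are independent for a fixed `k` and `j`, which is the loop a parallel-for construct would typically target; in the recursive form, the two half-sorts are independent tasks.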

    Custom Command Shell in Linux

    This is a mini project developed for the ECE AUTh course - Operating Systems. It is a mini custom shell that can interpret and execute bash commands in a Linux environment. It can function in 2 ways: it can read commands straight from the command line, or it can execute a bash script.

    View Project

    Java Socket Programming

    This is a project developed for the ECE AUTh course - Computer Networks II. It is a simple program demonstrating online communications using Java and sockets. It is specifically designed to work with an on-campus server and to respond to the specific instructions designed for that particular server. Feel free to explore the code! :)

    View Project

    Serial Comms in Java

    This is a project developed for my undergraduate course at ECE AUTh - Computer Networks I. It is a simple program demonstrating serial communications between a PC and a server located at the university. The assignment had 4 main tasks:

  • echo: receive simple echo packets and measure the download time over a given interval
  • image: receive and save 2 types of image data, one with and one without errors
  • gps: receive, decode and save GPS points, and afterwards use these coordinates to receive an image of these locations
  • ackNack: receive and time the download of randomly received packets, some of which may contain errors

    View Project

    Arduino_temp_and_proximity_sensors

    This is a small project developed for the ECE AUTh course - Microprocessors and Peripherals. It is a simple program built for the Arduino platform that measures the temperature in a room. When a user comes close to the module, the last measured temperature and the mean of the last 24 measured temperatures are displayed on the LCD screen. An ultrasonic sensor measures the distance of the user from the module. The module also has LED indicators: green for a low temperature (lower than a threshold we defined), red for a high temperature, and a yellow warning LED.
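The "last reading plus mean of the last 24" logic can be sketched with a fixed-size buffer. The example is in Python rather than Arduino C++, and the class name `TemperatureLog`, its methods, and the sample readings are hypothetical; only the 24-reading window comes from the description above.

```python
from collections import deque


class TemperatureLog:
    """Keeps only the most recent `window` temperature readings."""

    def __init__(self, window=24):
        # A deque with maxlen silently drops the oldest reading when full.
        self.readings = deque(maxlen=window)

    def add(self, temperature):
        self.readings.append(temperature)

    def last(self):
        return self.readings[-1]

    def mean(self):
        return sum(self.readings) / len(self.readings)


log = TemperatureLog()
for reading in [21.5, 22.0, 22.5]:  # made-up sample readings in degrees Celsius
    log.add(reading)
print(log.last(), log.mean())
```

On a microcontroller the same idea is usually implemented as a fixed array used as a circular buffer, since dynamic containers are costly on such small devices.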

    The modules used:

  • Arduino Rev3
  • DS18B20 Temperature Sensor Module for Arduino
  • HC-SR04 Ultrasonic Module Distance for Arduino
  • LCD Display 16x2 Module HD44780

    View Project

    Programming Languages Experience

    Skills - Talents - Sports

    Contact