Data Science
Note: International students may enroll in the four mandatory units taught in the 1st semester (Big Data Management, Data Science Methodologies and Technologies, Prediction Models, and Pattern Recognition), so these units may be taught in English.
Programme Structure for 2024/2025
Curricular Courses | Group | Credits |
---|---|---|
Data Driven Strategy Optimization | Scholar Group > Common Branch | 6.0 ECTS |
Time Series Analysis and Forecasting | Scholar Group > Common Branch | 6.0 ECTS |
Deep Learning for Computer Vision | Scholar Group > Common Branch | 6.0 ECTS |
Bayesian Modelling | Scholar Group > Common Branch | 6.0 ECTS |
Big Data Processing and Modeling | Scholar Group > Common Branch | 6.0 ECTS |
Text Mining for Data Science | Scholar Group > Common Branch | 6.0 ECTS |
Advanced Network Analysis | Scholar Group > Paths > Holders of a 1st Cycle in Data Science or related | 6.0 ECTS |
Advanced Distributed Databases | Scholar Group > Paths > Holders of a 1st Cycle in Data Science or related | 6.0 ECTS |
Business Analytics Fundamentals | Scholar Group > Paths > Holders of a 1st Cycle in Data Science or related | 6.0 ECTS |
Big Data Management | Scholar Group > Paths > Holders of 1st Cycle in Other Areas | 6.0 ECTS |
Data Science Methodologies and Technologies | Scholar Group > Paths > Holders of 1st Cycle in Other Areas | 6.0 ECTS |
Prediction Models | Scholar Group > Paths > Holders of 1st Cycle in Other Areas | 6.0 ECTS |
Pattern Recognition | Scholar Group > Paths > Holders of 1st Cycle in Other Areas | 6.0 ECTS |
Ciberlaw | Scholar Group > Common Branch | 6.0 ECTS |
Project Design for Data Science | Scholar Group > Common Branch | 6.0 ECTS |
Dissertation in Data Science | Final Work | 42.0 ECTS |
Master Project in Data Science | Final Work | 42.0 ECTS |
Data Driven Strategy Optimization
LG1. Understand data-driven decision-making
LG2. Learn to use dynamic optimization and reinforcement learning algorithms adequately
LG3. Apply and evaluate reinforcement learning algorithms for real situations
LG4. Gain new knowledge/practice in Python
1. Data-driven strategies and their implementation in organizations
2. Review of basic concepts in statistics
3. Markov processes, dynamic optimization, Bellman equation
4. Environment, agents, strategy, actions, losses and gains, experience-based learning
5. Reinforcement learning algorithms: Q-learning, Multi-Armed Bandits, Value and Policy Iteration
6. Examples and Case studies
First-period assessment consists of the following:
1. Evaluation during the semester:
a) Mid-term quiz - 20% of final grade (min 10 points)
b) Group work/project with individual oral presentation - 80% (70% + 10%) of final grade (min 10 points)
A minimum of 10 points is required for approval.
or
2. Individual project: 100% of final grade. A minimum of 10 points is required for approval.
Second-period assessment:
- Individual project: 100% of final grade. Requires a minimum of 10 points for approval.
Title: (1). Diana Mendes, (2024), Slides and Notebooks (Moodle)
(2). Richard S. Sutton and Andrew G. Barto, (2018), Reinforcement Learning: An Introduction, The MIT Press.
(3). Osborne, P., Singh, K., Taylor, M., (2022), Applying Reinforcement Learning on Real-World Data with Practical Examples in Python, Springer.
Title: (1). Enes Bilgin, (2020), Mastering Reinforcement Learning with Python, Packt.
(2). Chan, L., Hogaboam, L., Cao, R., (2022), Applied Artificial Intelligence in Business, Springer.
Time Series Analysis and Forecasting
At the end of this learning unit's term, the student must be able to:
LG1. Recognize and apply the classical time series models;
LG2. Recognize and apply ARIMA and GARCH models;
LG3. Recognize and apply multivariate time series models;
LG4. Recognize and apply Machine Learning algorithms (neural networks) for time series forecasting/trading;
LG5. Basic programming and computation with R and Python
LG6. Application of the studied concepts: information and value extraction from real-world data.
P1. Time series (2 lectures)
P1.1. Basic concepts
P1.2. Trends and seasonality
P2. Introduction to univariate stochastic time series models (4 lectures)
P2.1. Stationarity, unit root tests
P2.2. ARMA/ARIMA models
P2.3. Residual assumptions, diagnostic tests
P2.4. Volatility, risk, ARCH/GARCH models
P2.5. Forecasting, measuring the forecast accuracy
P3. Introduction to multivariate time series models (2 lectures)
P3.1. VAR/VECM models
P3.2. Cointegration analysis and applications
P3.3. Forecasting
P4. Machine (Deep) Learning (6 lectures)
P4.1. Neural networks for time series
P4.2. LSTM, forecasting and trading
P5. Programming/computing with Python
P6. Application of the studied concepts: information and value extraction from real-world data (2 lectures)
The following teaching methodologies (TM) will be used:
TM1. Expositional, for the presentation of the theoretical frameworks
TM2. Participative, with analysis of scientific papers
TM3. Active, with group work
TM4. Experimental, in computer laboratories, performing analyses on real data
TM5. Self-study, related to autonomous work (AW) by the student, as set out in the Class Planning
The periodic evaluation consists of:
a) An individual test (60%).
b) Team work (40%).
The periodic evaluation requires that students attend at least 80% of classes. The test covers all topics.
In this type of evaluation, students must achieve a minimum grade of 8.5 in the individual test and of 10 in the team work. Otherwise, students must take a final exam (minimum approval score: 10).
Title: Course files (slides and scripts) to be made available on e-learning/Fenix
Yves Hilpisch (2018), Python for Finance, 2nd Edition, O'Reilly Media, Inc.
Tarek A. Atwan, (2022), Time Series Analysis with Python Cookbook, Packt Publishing.
Mills, T.C. (2019), Applied Time Series Analysis: A Practical Guide to Modeling and Forecasting, Academic Press, Elsevier Inc.
Brooks, C., (2019), Introductory Econometrics for Finance, 4th ed., Cambridge University Press.
Title: Edward Raff, (2022), Inside Deep Learning: Math, Algorithms, Models, Manning Publications Co.
Louis Owen, (2022), Hyperparameter Tuning with Python, Packt Publishing.
James Ma Weiming, (2019), Mastering Python for Finance: Implement advanced state-of-the-art financial statistical applications using Python, 2nd Edition, Packt Publishing.
Juselius, K., (2006), The Cointegrated VAR Model: Methodology and Applications, Oxford University Press.
Deep Learning for Computer Vision
O1: To know the basic digital image formation process
O2: To represent an image in different color spaces and in the frequency domain
O3: To perform typical image processing operations
O4: To extract low-level characteristics from an image
O5: To implement a machine learning system based on classic algorithms for image content classification
O6: To know the typical architecture of a convolutional neural network (CNN) and to understand how it works
O7: To solve a medium complexity image classification problem using CNNs
O8: To apply knowledge transfer and fine-tuning methodologies based on pre-trained CNNs
O9: To use deep learning algorithms to identify objects in images
O10: To know deep learning algorithms for automatic generation of multimedia content
O11: To manipulate images using the OpenCV library
O12: To use the Tensorflow library to develop machine learning applications
C1 - Image acquisition and representation
C2 - Image operations
C3 - Extraction of image features
C4 - Introduction to machine learning
C5 - Artificial neural networks
C6 - Convolutional neural networks
C7 - Knowledge transfer
C8 - Network architectures for detecting and identifying image objects
C9 - Network architectures for automatic content generation
Given the eminently practical nature of the course, there is no exam assessment modality; all assessment is carried out throughout the semester.
Modality A (requires attendance to at least 60% of the classes)
- Participation in class (20%) - individual, evaluated based on exercises and other activities performed during the classes;
- Challenges (20%) - group work, carried out "at home";
- Project (60%) - carried out in a group, but evaluated individually; includes report and oral discussion.
Modality B (mainly intended for those who cannot attend classes)
- Practical test (40%) - individual, held at the end of the academic term;
- Project (60%) - individual or in a group, but evaluated individually; includes report and oral discussion.
All components have a minimum grade of 7.5 (out of 20).
Regardless of the modality followed, the grade for the "Project" component is limited by the individual performance demonstrated in the oral discussion, according to the following rule:
- Very good performance – no limit;
- Good performance – max. of 17 (out of 20);
- Sufficient performance – max. of 13 (out of 20);
- Poor performance – failed the course.
Project oral discussions will be scheduled during the normal evaluation seasons.
There is no grade improvement process.
The evaluation process in the special season is identical to Modality B, but in this case the project must be carried out individually.
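As an illustration only, the Modality A arithmetic described above can be sketched in Python. The function name `modality_a_grade`, the string labels for oral performance, and the choice to return `None` for a failed course are hypothetical. The sketch also assumes that the oral-discussion cap is applied to the project grade before weighting and that falling below the 7.5 minimum in any component means failing; the course regulations remain the authoritative source.

```python
from typing import Optional

# Caps on the project grade by oral-discussion performance (scale 0-20).
# None means "no limit"; "poor" is handled separately because it fails the course.
PROJECT_CAPS = {"very good": None, "good": 17.0, "sufficient": 13.0}
MIN_COMPONENT = 7.5  # minimum grade required in every component


def modality_a_grade(participation: float, challenges: float,
                     project: float, oral: str) -> Optional[float]:
    """Return the weighted Modality A grade, or None if the student fails.

    Assumption: the oral cap is applied to the project grade before weighting.
    """
    if oral == "poor":
        return None  # poor oral performance fails the course
    if oral not in PROJECT_CAPS:
        raise ValueError(f"unknown oral performance level: {oral!r}")
    cap = PROJECT_CAPS[oral]
    capped_project = project if cap is None else min(project, cap)
    # Assumption: any component below the minimum means failing.
    if min(participation, challenges, capped_project) < MIN_COMPONENT:
        return None
    # Weights from Modality A: participation 20%, challenges 20%, project 60%.
    return 0.20 * participation + 0.20 * challenges + 0.60 * capped_project


# A project graded 19 with a "good" oral discussion is capped at 17.
print(round(modality_a_grade(16, 14, 19, "good"), 2))  # 16.2
```

Under these assumptions, a student with 16 in participation, 14 in challenges, and a 19 project capped at 17 obtains 0.2 × 16 + 0.2 × 14 + 0.6 × 17 = 16.2.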
Title: Tomás Brandão, Course materials made available on the e-learning platform, 2023
J. Howse, J. Minichino, Learning OpenCV 4 with Python 3, 3rd Edition, Packt Publishing, 2020
M. Elgendy, Deep Learning for Vision Systems, Manning, 2020
Title: M. Nixon, A. Aguado, Feature Extraction and Image Processing for Computer Vision, 4th Edition, Academic Press, 2019
I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016
Various, Tutorials and documentation for the OpenCV library, https://opencv.org/
Various, Tutorials and documentation for the Tensorflow library, https://www.tensorflow.org/
R. Szeliski, Computer Vision: Algorithms and Applications, 2nd Edition, Springer, 2021, https://szeliski.org/Book/
F. Chollet, Deep Learning with Python, 2nd Edition, Manning, 2021
Bayesian Modelling
LO1. Characterize the basic concepts of Bayesian modelling
LO2. Apply Bayesian regression, classification and optimization models to support decision making
LO3. Apply the Bayesian approach to statistical learning
PC1. Bayes Theorem and Bayesian paradigm
PC2. Graphical and hierarchical models
PC3. Bayesian inference
PC4. Bayesian optimization
PC5. Bayesian regression and classification
PC6. Bayesian latent factor models
Students may choose either Evaluation during the semester or Final exam.
Assessment throughout the semester:
- group work with minimum grade 8 (50%)
- individual test with minimum grade 8 (50%)
Approval requires a minimum grade of 10.
EXAM:
The Final Exam is a written exam. Students have to achieve a minimum grade of 10 to pass.
Title: R / Python code
Various scientific papers
Lecture slides
Reich, B. J., S. K. Ghosh (2019), Bayesian Statistical Methods, Boca Raton: Chapman and Hall/CRC
McElreath, R. (2020), Statistical Rethinking: A Bayesian Course with Examples in R and Stan, CRC Press.
Levy, R., Mislevy, R. J. (2016), Bayesian Psychometric Modeling, 1st Edition. Boca Raton: Chapman and Hall/CRC
Kruschke, J. K. (2015), Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. Academic Press / Elsevier.
Title: Durr, O., B. Sick (2020), Probabilistic Deep Learning, Manning Publications Co.
Theodoridis, S. (2020), Machine Learning: A Bayesian and Optimization Perspective, Elsevier Ltd.
Martin, O., R. Kumar, J. Lao (2022), Bayesian Modeling and Computation in Python, CRC Press.
Heard, N. (2021), An Introduction to Bayesian Inference, Methods and Computation, Berlin: Springer Cham.
Albert, J., J. Hu (2020), Probability and Bayesian Modeling, Boca Raton: CRC Press/Taylor & Francis Group.
Big Data Processing and Modeling
At the end of this course, students should be able:
OA1: to know and understand the principal big data processing platforms
OA2: to understand and know how to apply distributed programming / computing models
OA3: to understand and know the stages (pipeline) of a machine learning big data project
OA4: to know how to apply dimensionality reduction techniques
OA5: to apply supervised or unsupervised learning techniques to large scale problems
OA6: to understand and know how to apply techniques for processing data streams in real-time
CP1: Big data platforms
CP2: Machine learning for big data
CP3: Dimensionality reduction
CP4: Large scale supervised/unsupervised learning
CP5: Data stream analysis
CP6: Case studies: PageRank and recommender systems
This course includes the following assessment methods: (1) assessment throughout the semester; (2) assessment by exam.
(1) Assessment throughout the semester
The final grade is made up of:
- Individual written test (70%), with a minimum mark of 8.0;
- Group work (30%).
The group work has a mid-term submission, which will count for 30%, and a submission at the end of the semester, which will count for 70%. Those who do not submit the mid-term portion will automatically be assessed by exam.
The work will include an oral presentation/discussion, and the final grade will be individual.
(2) Assessment by exam
The final grade will be based on a single written exam.
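As a worked illustration of option (1), the nested weighting can be sketched in Python. The function name `semester_grade` and the use of `None` to signal "assessed by exam instead" are hypothetical, and the sketch assumes that a test mark below 8.0 also sends the student to the exam. The individual adjustment from the oral discussion is not modelled.

```python
from typing import Optional


def semester_grade(test: float, midterm: Optional[float],
                   final_submission: float) -> Optional[float]:
    """Return the semester grade (scale 0-20), or None if the student
    is assessed by exam instead.

    Assumption: a test mark below 8.0 means the semester route is not passed.
    """
    if midterm is None:
        return None  # no mid-term submission: assessed by exam
    if test < 8.0:
        return None  # below the minimum mark for the written test
    # Group work: mid-term submission 30%, end-of-semester submission 70%.
    group_work = 0.30 * midterm + 0.70 * final_submission
    # Final grade: individual written test 70%, group work 30%.
    return 0.70 * test + 0.30 * group_work


print(round(semester_grade(12, 14, 16), 2))  # 13.02
```

Here the group work is 0.3 × 14 + 0.7 × 16 = 15.4, so the final grade is 0.7 × 12 + 0.3 × 15.4 = 13.02.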
Title: - Mining of Massive Datasets, A. Rajaraman, J. Ullman, 2011, Cambridge University Press.
- Big Data: Algorithms, Analytics, and Applications, Kuan-Ching Li et al., Chapman and Hall/CRC, 2015.
- Learning Spark: Lightning-Fast Big Data Analysis, Holden Karau, A. Konwinski, P. Wendell and M. Zaharia, O'Reilly Media, 2015.
- Understanding Deep Learning, Simon J. D. Prince, MIT Press, 2023.
- Advanced Analytics with Spark: Patterns for Learning from Data at Scale, Sandy Ryza et al., O'Reilly Media, 2017.
- Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale, Ofer Mendelevitch, Casey Stella and Douglas Eadline, Addison-Wesley, 2016.
Title: - All of Statistics: A Concise Course in Statistical Inference, L. Wasserman, Springer, 2003.
- The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, Springer, 2001.
- Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, 2016, MIT Press.
Text Mining for Data Science
OA1. Understand the fundamentals and challenges of Text Mining
OA2. Learn techniques for document preparation, cleaning, and representation
OA3. Apply Natural Language Processing methods
OA4. Classify texts using machine learning
OA5. Practical application of techniques in Text Mining
The learning objectives are aligned with a teaching method that combines theory and practice. Students will acquire a solid theoretical foundation in Text Mining, its challenges, and main techniques. Through practical activities and projects, they will develop skills in preprocessing, modeling, classification, and information extraction from texts. By the end of the course, students will be capable of applying Text Mining methods in real-world contexts, using current tools and resources, preparing them to tackle complex problems in the field of text analysis.
Introduction
CP1: Importance of large quantities of text, challenges and current methods
CP2: Unstructured vs. (semi-)structured information
CP3: Obtaining and filtering information, information extraction and Data Mining
Document Representation
CP4: Document pre-processing
CP5: Feature extraction: terms as features
CP6: Term weighting schemes
CP7: Vector space models
CP8: Similarity measures
Natural Language Processing
CP9: Language models
CP10: Morphology and part-of-speech tagging
CP11: Complex structures: syntactic analysis
CP12: Information extraction
Text Classification
CP13: Introduction to statistical machine learning
CP14: Evaluation
CP15: Generative classifiers
CP16: Discriminative classifiers
CP17: Unsupervised learning
CP18: Text Mining Resources
Case Study
CP19: Sentiment analysis
CP20: Topic classification and identification
This course uses only assessment throughout the semester and does not include exams.
Assessment components:
a) TESTS (2 mini-tests: 5% each, final test: 40%), taken during the course period;
b) PROJECT (50%).
The TESTS grade can be replaced by a written test to be taken in the assessment period corresponding to the 1st season, 2nd season or special season (Art. 14 of the RGACC).
The PROJECT grade is limited to the TEST grade + 6 points.
Students may improve their grade in the TESTS component by taking a written test during the assessment period corresponding to the 1st season. Students wishing to do so must inform the teachers as soon as the assessment throughout the semester marks are published.
Title: * Machine Learning for Text (2018). Charu C. Aggarwal. https://doi.org/10.1007/978-3-319-73531-3
* An Introduction to Text Mining: Research Design, Data Collection, and Analysis 1st Edition (October 11, 2017). Gabe Ignatow, Rada F. Mihalcea. SAGE Publications. https://methods.sagepub.com/book/an-introduction-to-text-mining
* Speech and Language Processing (3rd ed. draft, 2023), Dan Jurafsky and James H. Martin. Available at: https://web.stanford.edu/~jurafsky/slp3/
Title: * Natural Language Processing for Social Media, Second Edition. Synthesis Lectures on Human Language Technologies. Morgan & Claypool, 2017. Atefeh Farzindar and Diana Inkpen. https://link.springer.com/book/10.1007/978-3-031-02167-1
* Jacob Eisenstein. Introduction to Natural Language Processing. Adaptive Computation and Machine Learning. The MIT Press, 2019. https://mitpress.mit.edu/9780262042840/introduction-to-natural-language-processing/
Advanced Network Analysis
After successfully attending the curricular unit, students should be able to:
OA1. Know the fundamental concepts of network science
OA2. Know the essential metrics and methods for describing and analyzing networks
OA3. Know how to use network analysis and visualization software
OA4. Know how to collect data, analyze and model networks
OA5. Know how to analyze diffusion processes in networks
OA6. Implement a network analysis solution to solve a given problem.
CP1. Introduction to the notion of network and Network Science
CP2. Software for network analysis
CP3. Graphs and network metrics
CP4. Static network models
CP5. Power laws and scale-free networks
CP6. Dynamic network models
CP7. Strategic network models
CP8. Processes in networks: percolation, diffusion and search
CP9. Robustness and resilience
CP10. Communities
CP11. Higher order networks and temporal networks
Given the practical nature of the contents, the assessment will encompass a project. Its subject should be aligned with all or part of the syllabus.
Exercises in class (10%).
Project (90%), including teamwork (report and software: 40%, and oral exam: 50%).
All components of the project (proposal, report, software, and oral exam) are mandatory. The minimum grade for each component is 10 on a scale of 0 to 20.
There will be a single deadline for submitting the project, except for students accepted to the special period of assessment, who will be allowed to submit during that period.
Presence in class is not mandatory.
There is no final exam.
Students aiming to improve their grade can submit a new project in the following academic year.
Title: Mark Newman, "Networks", second edition, Oxford University Press, 2020
Albert-Laszlo Barabasi, "Network Science", Cambridge University Press, 2016
Available online at http://networksciencebook.com
Advanced Distributed Databases
This course aims to enhance students' understanding of distributed database management systems (DBMS). It focuses on providing practical skills in designing, implementing and managing these databases, considering challenges such as replication and fragmentation. The curricular unit highlights the importance of guaranteeing the consistency and durability of data in distributed environments, as well as the efficient integration of multiple databases. Finally, it seeks to encourage students to have a critical and analytical view of future trends and innovations in this field.
1. Introduction to Distributed Database Management Systems (DBMS)
2. Distributed Database Design
3. Distributed Data Control
4. Distributed Transaction Processing
5. Data Replication
6. Database Integration
Given its eminently practical nature, the curricular unit (UC) does not offer assessment by exam.
Therefore, the evaluation will take place in the following ways:
1st season:
1. [60%] Group work with individual presentation and discussion* (min. 10 points)
2. [40%] Written test (min. 8 points)
* individual discussion is decisive, as poor performance may result in failure in the UC, regardless of the quality of the group work delivered.
2nd season and Special Season:
3. [60%] Individual work without presentation or discussion (min. 10 points)
4. [40%] Written test (min. 8 points)
Title: • M. Tamer Ozsu and Patrick Valduriez. (2019). Principles of Distributed Database Systems (4th. ed.). Springer Publishing Company, Incorporated.
• White, Tom. (2015). Hadoop: The Definitive Guide (4th. ed.). O'Reilly Media, Inc. ISBN: 9781491901632
Title: • Moniruzzaman, A B M & Hossain, Syed. (2013). NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison. Int J Database Theor Appl. 6.
• Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. (2006). Bigtable: a distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7 (OSDI '06). USENIX Association, USA, 15.
Business Analytics Fundamentals
LO1. At the end of the CU, each student should have acquired the necessary skills to understand how to use big data and perform data analysis to out-compete traditional companies in their industries.
LO2. Must also be able to define and implement analytical reports and dashboards, considering basic ETL processes, advanced analytical modeling and effective data visualization.
LO3. Finally, each student must develop soft skills, including Teamwork and Collaboration, Communication, Agile and Critical Thinking.
P1. Data-driven decision making.
P2. Types of Analytics.
P3. Data processing, modeling and visualization.
P4. Effective Business Presentation / communication; ability to explain complex analytical models and results.
P5. Power BI Analytics Platform.
1st Sitting:
- Written work, in groups (25%, minimum of 10 points in the final classification) - LO 1, 2, 3
- Individual laboratory project with digital presentation and oral discussion (75%, minimum of 10 points in the final classification) - LO 1, 2
2nd Sitting:
- Exam (100%, minimum grade of 10 points) - LO 1, 2, 3
Scale: 0-20 points.
Title: Aspin, A., Pro Power BI Desktop: Self-Service Analytics and Data Visualization for the Power User, 2020, 3rd Edition, Apress.
Microsoft, Microsoft Learn Power BI, n.a., Microsoft, https://learn.microsoft.com/en-us/training/powerplatform/power-bi
Albright, S. & Winston, W., Business Analytics: Data Analysis & Decision Making, 2019, 7th Edition, South-Western College Pub.
Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. & Silipo, R., Guide to Intelligent Data Science: How to Intelligently Make Use of Real Data, 2020, 2nd Edition, Springer International Publishing.
Knaflic, C. N., Storytelling com dados: um Guia Sobre Visualização de Dados Para Profissionais de Negócios, 2019, Alta Books.
Title: McCandless, D., Knowledge is Beautiful, 2014, William Collins.
Bahga, A. & Madisetti, V., Big Data Science & Analytics: A Hands-On Approach, 2016, VPT.
Meier, M., Baldwin, D., & Strachnyi, K., Mastering Tableau 2021: Implement advanced business intelligence techniques and analytics with Tableau, 2021, 3rd Edition, Packt.
Big Data Management
1. Manipulate NoSQL Databases using JSON;
2. Implement distributed and fault-tolerant data storage solutions;
3. Migrate data between databases;
4. Design and extract information from a multidimensional Data Warehouse;
5. Develop soft skills, namely Problem Solving, Teamwork and Collaboration and Critical Observation (achieved via assessment process).
1. Review of Relational Databases and advanced (aggregated) SQL queries in MySQL;
2. Introduction to NoSQL databases and database implementation in MongoDB;
3. Mapping between Relational Databases and Document Databases;
4. Data extraction using JSON;
5. Redundancy and Data Distribution to manage fault tolerance and large information volume;
6. Data migration between different storage systems;
7. Introduction to data warehouse technology;
8. Data processing and integration to populate a Data Warehouse;
9. Information Extraction from a Data Warehouse (Querying and Reporting).
Assessment throughout the semester consists of a written test (60% of the grade, minimum grade 7.5) and a group project (40% of the grade). Alternatively, there is assessment by exam.
Title: Andreas Meier, Michael Kaufmann, SQL & NoSQL Databases: Models, Languages, Consistency Options and Architectures for Big Data Management, Springer, 2019.
MongoDB Homepage
Golfarelli, M., Rizzi, S., Data Warehouse Design: Modern Principles and Methodologies, McGraw-Hill Osborne Media, 1st Edition, May 26, 2009.
Damas, L., "SQL - Structured Query Language", FCA Editora de Informática, 2005 (II);
Date, C.J., "An Introduction to Database Systems", Addison-Wesley Publishing Company, sixth edition, 1995 (I.2, I.3, I.4, II);
NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison, A B M Moniruzzaman, Syed Akhter Hossain, 2013 (https://arxiv.org/abs/1307.0191)
Data Science Methodologies and Technologies
On successful completion of this course, each student will be able to:
LO1. Define fundamental concepts in Data Science.
LO2. Explain the tasks of a Data Science project and the types of analyses that can be produced.
LO3. Define the project methodologies that exist in Data Science and define which project plan suits the context and tasks of a given problem.
LO4. Explain the concepts of Artificial Neural Network and Hyperparameter Optimisation.
The programme contents (CP) are as follows:
CP1: Fundamental concepts and definitions in Data Science.
CP2: Discussion of the ethical and regulatory aspects of data use and processing.
CP3: Data science project methodologies: what they are, what they consist of and how to apply them.
CP4: Basic preparation of structured data: cleaning and imputation.
CP5: Artificial Neural Networks: perceptron, MLP, backpropagation and hyperparameter optimisation.
This is a ‘hands-on’ curricular unit, and assessment should preferably take place throughout the semester, with the development of a group assignment (project).
This work will be presented and there will be a discussion led by the teaching team (presentation with a weight of 20% + discussion with a weight of 20% + report with a weight of 30%) (minimum mark: 10 points).
There will also be an individual test (30% weighting, with a minimum mark of 10).
If the student is justifiably unable to complete the assessment during the semester, he or she may be assessed in the second examination period by presenting and defending an individual assignment (100% of the grade, with a minimum mark of 10). In this case, it must be agreed upon between the Coordinator and the student before the end of the course's regular classes.
Title: Roiger, R. J. (2020). Just enough R! An interactive approach to machine learning and analytics. CRC Press.
Boehmke, B.; Greenwell, R. (2020). Hands-on Machine Learning with R. CRC Press.
Sharda, R., Delen, D., Turban, E., Aronson, J., & Liang, T. P. (2014). Business Intelligence and Analytics: Systems for Decision Support-(Required). Prentice Hall.
Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann.
Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media, Inc.
Title: Voeneky, S., Kellmeyer, P., Mueller, O., & Burgard, W. (Eds.). 2022. The Cambridge Handbook of Responsible Artificial Intelligence: Interdisciplinary Perspectives. Cambridge: Cambridge University Press.
Provost, F., & Fawcett, T. 2013. Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media, Inc.
Prediction Models
After attending the course, the student will be able to:
LG1: Understand data analytics: scope of application and procedures
LG2: Perform data analytics using R
LG3: Evaluate and interpret the data analytics results
Introduction to Machine Learning: supervised methods for prediction and classification.
PC1: Introduction
1.1. Regression Problems
1.2. Classification Problems
1.3. Training and Test Sets
1.4. Cross Validation
PC2: Linear Regression
2.1. Simple Linear Regression
2.2. Multiple Linear Regression
2.3. Applications with R
PC3: Logistic Regression
3.1. Simple Logistic Regression
3.2. Multiple Logistic Regression
3.3. Applications with R
PC4: Decision Tree-based Methods
4.1. Construction of Decision Tree Algorithms
4.2. Performance Improvement: Bagging and Boosting
4.3. Classification and Regression Trees (CART) Algorithm
4.4. Random Forests
4.5. Applications with R
1st SEASON ASSESSMENT
In the 1st Season, the Curricular Unit is assessed throughout the semester.
EVALUATION THROUGHOUT THE SEMESTER
- Individual Test (40%): minimum mark of 8;
- Group Coursework (60%): written report and code (50%) + oral presentation (10%)
2nd SEASON EVALUATION
In the 2nd Season, the Course is assessed through the completion of an Individual Project (100%): written report and code (80%) + oral discussion (20%)
In both periods, the student may be required to take an oral exam even if the final grade is >= 9.5.
Scale 0-20
Given the eminently practical nature of the course, assessment by final exam is not contemplated.
Title: Hastie, T.; Tibshirani, R., Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
Berk, R.A. (2017). Statistical Learning from a Regression Perspective. 2nd ed. Springer.
Boehmke, B.; Greenwell, R. (2020). Hands-On Machine Learning with R. CRC Press.
Title: Larose, D., Larose, C. (2015). Data Mining and Predictive Analytics. John Wiley & Sons.
Efron, B.; Hastie, T. (2016). Computer Age Statistical Inference: Algorithms, Evidence and Data Science. Cambridge University Press.
Burger, S. V. (2018). Introduction to Machine Learning with R. O'Reilly.
Roiger, R. J. (2020). Just enough R! An interactive approach to machine learning and analytics. CRC Press.
Anabela Costa, Lecture notes provided by the course lecturer, 2024/25.
Pattern Recognition
LG1: Characterize unsupervised data analytics
LG2: Use R for unsupervised data analytics
LG3: Evaluate, validate and interpret the results
PC1: Introduction to unsupervised learning methods
PC2: Principal component analysis (PCA)
- Main concepts and steps
- Examples using R
PC3: Non-probabilistic clustering techniques
- Hierarchical methods
- Partitioning methods
- Outlier detection using clustering
- Examples using R
PC4: Probabilistic clustering techniques
- The EM algorithm
- Mixture models
- Examples using R
PC5: Association rules
- Frequent items and association rules
- Apriori algorithm
- Examples using R
Students may choose either Evaluation during the semester or Final exam.
Evaluation during the semester:
- group work with minimum grade 8 (50%)
- individual test with minimum grade 8 (50%)
Approval requires a minimum grade of 10.
EXAM:
The Final Exam is a written exam. Students have to achieve a minimum grade of 10 to pass.
Title: Bouveyron, C., Celeux, G., Murphy, T. B., Raftery, A. E. (2019), Model-Based Clustering and Classification for Data Science: With Applications in R, 1st Edition, Cambridge University Press.
James, G., Witten, D., Hastie, T., Tibshirani, R. (2013), An introduction to statistical learning: with applications in R, New York: Springer.
Hastie, T., Tibshirani, R., Friedman, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E. (2014), Multivariate Data Analysis, 7th Edition, Essex, UK: Pearson Education.
Title: Nwanganga, F., Chapple, M. (2020), Practical Machine Learning in R, 1st Edition, Wiley.
Wedel, M., Kamakura, W. A. (2000), Market Segmentation. Conceptual and Methodological Foundations (2nd edition), International Series in Quantitative Marketing. Boston: Kluwer Academic Publishers.
McLachlan, G. J., Peel, D. (2000), Finite Mixture Models. New York: John Wiley & Sons.
Lattin, J., Carroll, D., Green, P. (2003), Analyzing Multivariate Data, Pacific Grove, CA: Thomson Learning.
Jolliffe, I. (1986), Principal Component Analysis. New York: Springer-Verlag.
Hennig, C., Meila, M., Murtagh, F., Rocci, R. (eds.) (2016), Handbook of Cluster Analysis, Handbooks of Modern Statistical Methods. Boca Raton: Chapman & Hall/CRC.
Aggarwal, C. C., Reddy, C. K. (eds.) (2014), Data Clustering: Algorithms and Applications. Boca Raton: CRC Press.
Ciberlaw
This CU aims to raise students' awareness of the principles and rules applicable to ICT uses and of their significance as an expression of the values that businesses, markets and technological progress itself should accommodate. It also seeks to promote students' knowledge acquisition and to encourage their critical perspectives, combining theory and practice through the analysis and discussion of case studies.
Introduction: THE TIPC and the sources of national law. Importance of European policies. Constitutional principles, freedoms and rights in the 'software age'. CyberSecurity Law. Computer programs: Related rights. Protection of personal data and privacy: the EU General Data Protection Regulation and the Enforcement Act. Emerging challenges: big data, information quality, cybercrime and algorithmic decision making. Meaning of crisis management. Ethics and mechanisms of criminal participation.
Assessment is based on two individual research papers, one of which is delivered as an oral presentation in a format to be defined (80%). Active participation in classes counts towards the final grade (20%).
Bibliography
Title: -Gonçalves, Maria Eduarda, "Tensões entre a liberdade de informação e a propriedade intelectual na era digital", in Jorge Bacelar Gouveia and Heraldo de Oliveira Silva (eds.), I Congresso Luso-Brasileiro de Direito, Coimbra, Almedina, 2014, p. 275-295.
-Gonçalves, Maria Eduarda, "The EU Data Protection Reform and the Challenges of the Big Data. Remaining uncertainties and ways forward", Information & Communications Technology Law 26 (2), 2017, p. 1-26.
-Gonçalves, Maria Eduarda, Direito da Informação, Novos direitos e modos de regulação na sociedade da informação, Coimbra, Almedina, 2003 (next edition planned for 2019).
-Reed, C., Computer Law, 7th Edition, Oxford, Oxford University Press, 2012.
-Revista do IDN - Nação e Defesa, n.º 133, CiberSegurança.
-Martins, José Carlos Lourenço, Gestão de Segurança da Informação e Cibersegurança nas Organizações: Sistema e método, Sílabas & Desafios, October 2021, ISBN: 9789898842596.
Title: -https://link.springer.com/content/pdf/10.1007/s11292-022-09504-2.pdf
- https://www.academia.edu/39724415/Protocolo_de_Sa%C3%ADda_pol%C3%ADtica_e_plano_no_contexto_da_trilogia_da_Segurança_da_Informação
- https://www.academia.edu/699096/Do_espectro_de_conflitualidade_nas_redes_de_informacao_por_uma_reconstrucao_conceptual_do_terrorismo_no_ciberespaco
- https://www.academia.edu/40494857/Segurança_da_informação_e_cibersegurança_aspetos_práticos_e_legislação
- https://www.academia.edu/699210/CONTRIBUTO_PARA_ESTUDOS_DE_INTELLIGENCE_SOBRE_OS_SETE_ESPAÇOS_DE_CONFLITO_POR_UM_MODELO_HOLÍSTICO_DE_ANÁLISE
-Levitt, Steven D., Dubner, Stephen J. - Freakonomics, Penguin, 2005.
-Lindstrom, Martin - Brandwashed, 1st ed., Gestão Plus, 2012.
-Gleick, James - Informação, 1st ed., Círculo Leitores, 2012.
-Ayres, Ian - Super Crunchers, 1st ed., Academia do Livro, 2010.
-Complementary Bibliography
Project Design for Data Science
OA1. Acquire the skills to define a specific research problem
OA2. Acquire the skills to identify a dataset suitable for addressing the proposed research goal
OA3. Acquire the skills to evaluate and critically discuss the results achieved in light of the defined problem
OA4. Acquire the skills to conduct a literature review that positions the research problem and establishes its relevance
OA5. Acquire scientific writing skills
CP1. Framing the research subject
CP2. Defining the research problem
CP3. Conducting literature review
CP4. Defining the scientific body of knowledge
CP5. Identifying and analysing a data source relevant to the research problem
CP6. Critically analyzing the results in Data Science
CP7. Developing scientific writing
1st and 2nd season evaluation: Individual writing of a scientific article and its presentation (100%)
Bibliography
Title: Gregor, S., & Hevner, A. R. (2013). Positioning and presenting design science research for maximum impact. MIS Quarterly, 37(2).
Gastel, B., & Day, R. A. (2016). How to write and publish a scientific paper. ABC-CLIO.
Title: Agarwal, R., & Dhar, V. (2014). Big data, data science, and analytics: The opportunity and challenge for IS research.
Hall, S. (2017, June). Practise makes perfect: developing critical thinking and writing skills in undergraduate science students. In Proceedings of the 3rd International Conference on Higher Education Advances (pp. 1044-1051). Editorial Universitat Politècnica de València.
Dissertation in Data Science
Learning goals (LG):
LG1- Independent scientific thought and originality
LG2- Scientific skills
LG3- Logical coherence and scientific argumentation
LG4- Quality of the presentation
Syllabus contents (SC):
SC1- Formulate the starting question
SC2- Identify the relevant literature and prepare a theoretical and empirical review
SC3- Formulate the research problem and the hypotheses
SC4- Design a study to test the hypotheses
SC5- Carry out the study
SC6- Analyse and interpret the results
SC7- Draw up the dissertation plan
SC8- Write the dissertation
The dissertation is assessed by a jury in a public defence, once the supervisor has confirmed that it is complete and of sufficient quality to be defended publicly. Assessment will be based on the scientific merit of the study and on its theoretical and methodological adequacy.
Bibliography
Title: Garson, G. D. (2001), Guide to Writing Empirical Papers, Theses, and Dissertations, Marcel Dekker Inc.
Bui, Yvonne N. (2014), How to Write a Master's Thesis, Sage Publications, Inc.
Title: Punch, Keith F. (2016), Developing Effective Research Proposals, Sage Publications.
Master Project in Data Science
Learning goals (LG):
LG1- Independent scientific thought and originality
LG2- Scientific skills
LG3- Logical coherence and scientific argumentation
LG4- Quality of the presentation
Syllabus contents (SC):
SC1- Formulate the starting question
SC2- Identify the relevant literature and prepare a theoretical and empirical review
SC3- Formulate the research problem and the hypotheses
SC4- Design a study to test the hypotheses
SC5- Carry out the study
SC6- Analyse and interpret the results
SC7- Draw up the Master Project plan
SC8- Write the Master Project
The Master Project is assessed by a jury in a public defence, once the supervisor has confirmed that it is complete and of sufficient quality to be defended publicly. Assessment will be based on the scientific merit of the study and on its theoretical and methodological adequacy.
Title: Garson, G. D. (2001), Guide to Writing Empirical Papers, Theses, and Dissertations, Marcel Dekker Inc.
Bui, Yvonne N. (2014), How to Write a Master's Thesis, Sage Publications, Inc.
Title: Punch, Keith F. (2016), Developing Effective Research Proposals, Sage Publications.
Accreditations