Lecture 1
Cornell University
INFO 2950 - Spring 2024
January 23, 2024
Dr. Benjamin Soltoff
Lecturer in Information Science
Gates Hall 216
Physically interact with at least 2 people sitting around you. Introduce yourselves to each other and share:
Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge.
[A]n interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured and unstructured data, and apply knowledge from data across a broad range of application domains. [1]
We’re going to learn to do this in a tidy way – more on that later!
This is a course on computing applications for data science workflows
Or more like a demo for today…
Respond at PollEv.com/soltoff
https://info2950.infosci.cornell.edu/
All linked from the course website:
Important
Make sure you can access RStudio (Posit) Workbench before lab on Friday.
Prepare: Introduce new content and prepare for lectures by completing the readings
Participate: Attend and actively participate in lectures and labs, office hours, team meetings
Practice: Practice applying statistical concepts and computing with application exercises during lecture, graded for completion
Perform: Put together what you’ve learned to analyze real-world data
Category | Percentage |
---|---|
Exams | 25% |
Homework | 25% |
Project | 25% |
Labs | 15% |
Application Exercises | 10% |
See course syllabus for how the final letter grade will be determined.
I want this course to be accessible to students of all abilities. Please feel free to let me know if there are circumstances affecting your ability to participate in class.
As long as you meet the prereqs
Only work that is clearly assigned as team work should be completed collaboratively.
Homework must be completed individually. You may not directly share answers or code with others; however, you are welcome to discuss the problems in general and ask for advice.
Exams must be completed individually. You may not discuss any aspect of the exam with peers.
We are aware that a huge volume of code is available on the web, and many tasks may have solutions posted.
Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism, regardless of source.
All code must be written by you, the human being.
Use generative AI to facilitate, rather than hinder, learning
✅ GAI tools for reference purposes
❌ GAI tools for writing code/analysis
❌ GAI tools for narrative
You are ultimately responsible for the work you turn in; it should reflect your understanding of the course content
Ask if you’re not sure if something violates a policy!