- Continue to Module 7: Practices for Data Sharing in Bioinformatics
Learning Objectives
- Identify common bioinformatics database types and what they are used for
- Describe how database versions, release notes, and identifiers affect reproducibility
- Explain how to document and share database queries (including filters and parameters)
- Recognize provenance and metadata needed to reproduce data mining results
Module Content
Work through the content below in order; each item builds on the one before it.
Reading: Introduction to Databases in a Bioinformatics Context
Estimated time to complete: 5–15 minutes
Video: Database Types
Download Slides (opens in new tab) | Open directly in Yuja (opens in new tab)
Captions and transcript available within the player.
Estimated time to complete: 10–15 minutes
Reflection: Query Transparency and Reuse
Reflect on the following question(s) before moving on:
- Think of a search you performed on an internet search engine such as Google or DuckDuckGo. Would someone else running the same search on another computer retrieve exactly the same results?
- Now consider the same question for a PubMed search. What information would you need to provide so that another person retrieves the same results as you? Is this feasible? Why or why not?
- What could change between database releases that would affect results (IDs, annotations, reference genomes, or curation)?
- What is one simple way you could record queries and filters to help future-you reproduce the analysis?
Estimated time to complete: 5–15 minutes
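As a starting point for the last reflection question, one possible approach is to write each search to a small log file. The sketch below is only an illustration: the function names (`make_query_record`, `append_record`, `read_records`), the field names, and the example values are all hypothetical and do not come from any particular database or tool.

```python
import json
from datetime import datetime, timezone


def make_query_record(database, release, query, filters, retrieved_at=None):
    """Bundle the details someone would need to rerun a search."""
    if retrieved_at is None:
        # Record when the search was run, in UTC, if no time is supplied.
        retrieved_at = datetime.now(timezone.utc).isoformat()
    return {
        "database": database,
        "release": release,
        "query": query,
        # Sort filter names so two logs of the same search look the same.
        "filters": dict(sorted(filters.items())),
        "retrieved_at": retrieved_at,
    }


def append_record(path, record):
    """Append one record per line (JSON Lines) so the log is easy to compare."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_records(path):
    """Read every logged record back as a list of dictionaries."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # All values below are made up for illustration.
    record = make_query_record(
        database="ExampleDB",
        release="example-release-1",
        query="BRCA1 AND human",
        filters={"record_type": "reviewed", "date_to": "2020-12-31"},
    )
    append_record("query_log.jsonl", record)
    print(read_records("query_log.jsonl")[-1]["query"])
```

A plain text file or lab notebook entry with the same fields would serve the same purpose; the point is that the database, its release, the exact query, the filters, and the date are written down together.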
Reading: NCBI and Databases in Bioinformatics
Estimated time to complete: 15–20 minutes
Activity: Update Publications and their Purpose
Estimated time to complete: 30–45 minutes
Reflection: Database Funding and Sustainability
Reflect on the following question(s) before moving on:
- Many bioinformatics databases are supported, at least initially, by grants or by strong community support. In what ways can you help the creators of these databases continue to receive funding for these resources? (Grants are not always guaranteed!)
- What might happen to a database if it loses funding? What are the potential impacts on the community it serves?
Estimated time to complete: 5–15 minutes
Video: Other Platforms and Storage for Big Data
Download Slides (opens in new tab) | Open directly in Yuja (opens in new tab)
Captions and transcript available within the player.
Estimated time to complete: 10–15 minutes
Reading: Information Retrieval
Estimated time to complete: 5–15 minutes
Video: Information Management Systems
Download Slides (opens in new tab) | Open directly in Yuja (opens in new tab)
Captions and transcript available within the player.
Estimated time to complete: 10–15 minutes
Reflection: Information Retrieval versus Database Query
Reflect on the following question(s) before moving on:
- Why might a novice user prefer keyword searching over writing an SQL query?
- How does relevance ranking in information retrieval (IR) differ from exact matching in a database query?
- Would you describe PubMed as “just a database”? Why or why not?
- What kinds of biological information are easiest to retrieve with IR methods rather than strict relational queries?
Estimated time to complete: 5–15 minutes
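To make the contrast in these questions concrete, the sketch below runs an exact-match SQL query and a very simple keyword ranking over the same three made-up records. It is a deliberately simplified illustration: the records, the `papers` table, and the function names are hypothetical, and the ranking just counts shared words, whereas real search systems typically use more sophisticated weighting.

```python
import sqlite3

# Made-up records for illustration only.
RECORDS = [
    (1, "BRCA1 mutations in breast cancer"),
    (2, "Breast cancer screening guidelines"),
    (3, "Zebrafish heart development"),
]


def exact_match(title):
    """Database-style query: a row either satisfies the condition or it does not."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO papers VALUES (?, ?)", RECORDS)
    rows = conn.execute(
        "SELECT id FROM papers WHERE title = ? ORDER BY id", (title,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def ranked_search(keywords):
    """IR-style search: score each record by shared words, best match first."""
    terms = set(keywords.lower().split())
    scored = []
    for record_id, title in RECORDS:
        overlap = len(terms & set(title.lower().split()))
        if overlap:
            scored.append((overlap, record_id))
    # Highest overlap first; ties are broken by record id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [record_id for _, record_id in scored]


if __name__ == "__main__":
    print(exact_match("breast cancer"))          # no title equals this string
    print(ranked_search("breast cancer BRCA1"))  # partial matches are ranked
```

The exact-match query returns nothing unless the whole title matches, while the keyword search returns every record that shares at least one word and orders them by how many words they share.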
Practice Quiz
Estimated time to complete: 5–10 minutes
Next Steps
Notes for Educators
You can use this content in your classes! This work is provided under a Creative Commons Attribution-NonCommercial (CC BY-NC) license. You can use, copy, share, or adapt this material for teaching, learning, or other non-commercial purposes.
You can pick and choose what you want from this website, or you can download the entire website's worth of content in ready-to-go Canvas format here: Download as a Canvas Course.
We do ask that you cite this work when you use it: Learn how.
Notes for Students
There's more on Canvas! Find more content and quizzes on the Canvas version of this course. Go back and click “Enroll in the Self-Paced Canvas Course” for more.