Article Category: Research Article
Online Publication Date: 31 Jan 2025

Expanding the BASIL CURE

Page Range: 22 – 36
DOI: 10.35459/tbp.2024.000273

ABSTRACT

In the Biochemistry Authentic Scientific Inquiry Lab (BASIL) course-based undergraduate research experience, students use a series of computational (sequence and structure comparison, docking) and wet lab (protein expression, purification, and concentration; sodium dodecyl sulfate-polyacrylamide gel electrophoresis [SDS-PAGE]; enzyme activity and kinetics) modules to predict and test the function of protein structures of unknown function found in the Protein Data Bank and UniProt. BASIL was established in 2015 with a core of 10 faculty members on six campuses, with the support of an educational researcher and doctoral student on a seventh campus. Since that time, the number of participating faculty members and campuses has grown, and we have adapted our curriculum to improve access for all who are interested. We have also expanded our curriculum to include new developments that are appearing in computational approaches to life science research. In this article, we provide a history of BASIL, explain our current approach, describe how we have addressed challenges that have appeared, and describe our curriculum development pipeline and our plans for moving forward in a sustainable and equitable fashion.

I. INTRODUCTION

The Biochemistry Authentic Scientific Inquiry Lab (BASIL) began in a research laboratory where students were using computational and wet lab methods to predict protein function. Two of the students developed ProMOL, which is a plug-in for PyMOL (Schrödinger, LLC, Cambridge, MA) that compared any three-dimensional (3D) protein structure with a library of enzyme-active sites from the Mechanism and Catalytic Site Atlas and suggested a function for the query protein (1–4). Other research students then began comparing results they obtained by using our software with sequence alignment results from the Basic Local Alignment Search Tool (BLAST) and Pfam, and additional research students began exploring docking programs (AutoDock, PyRx) (5, 6). Before the acronym CURE (course-based undergraduate research experience) entered the science, technology, engineering, and math vernacular, it seemed natural to transition this project into the teaching laboratory. With support from the National Science Foundation (NSF), BASIL was born, although it did not receive its name until ∼3 years later.

The initial BASIL team came together in stages. The original research group included Herbert Bernstein, Paul Craig, and Jeff Mills at Rochester Institute of Technology (RIT). Craig invited colleagues from his undergraduate (including Robert Stewart from Oral Roberts University, Tulsa, Oklahoma) and graduate (including Colette Daubner from St. Mary’s University, San Antonio, Texas) school experiences. Our initial application to NSF was not funded, but the organization provided valuable feedback, urging us to recruit more faculty members. In response, Craig presented a poster about the project at the 2014 national meeting of the American Society for Biochemistry and Molecular Biology, which attracted Mike Pikaart (Hope College, Holland, Michigan), Rebecca Roberts (Ursinus College, Collegeville, Pennsylvania), and Anya Goodman (California Polytechnic State University, San Luis Obispo, California) to the project. Roberts then recruited Julia Koeppe, who was an Ursinus College faculty member at the time; likewise, Goodman recruited Ashley Ringer McDonald from California Polytechnic State University. This team successfully obtained NSF Improving Undergraduate STEM Education (IUSE) funding in 2015. In Fall 2015, we created five computational and six wet lab modules, which are hosted on the BASIL website (Table 1).

Table 1. BASIL modules and learning objectives.

The entire curriculum requires ∼42 h of in-laboratory time to complete and uses basic laboratory equipment; implementation costs are similar to those of other biochemistry laboratory curricula. Student modules suitable for a teaching laboratory were created and provided to the community, which allowed colleagues to easily customize the experiments to suit their learning outcomes. We used backward design in creating the modules, focusing first on the learning outcomes we hoped our students would attain (Table 1).

Each student module contains the following sections: learning goals and objectives, introduction, protocol, interpreting results, and references. Assessment questions for each module are aligned with the learning objectives. Instructor resources have also been created and include an instructor guide featuring prerequisite student knowledge, prerequisite instructor knowledge, a teaching discussion, experimental design considerations, laboratory preparation notes, and data interpretation considerations. It took 2 years for the curriculum to mature because conversations and efforts initially happened during 1-h video conferences. In Summer 2017, members of the BASIL team met for an intense writing process under the leadership of Ringer McDonald, a computational chemist who helped create an orderly and uniform online curriculum in GitHub (Microsoft Corp., Seattle, WA). Subsequent changes to the curriculum are described in section III.

Over the past 8 years, BASIL has been implemented in diverse higher education settings, from liberal arts colleges to research universities, in laboratories ranging from 8 to 24 students. Departments offering the BASIL experience include chemistry, biology, biochemistry, pharmaceutical sciences, and computer science. BASIL has also been implemented successfully at the high school level.

The curriculum has been used in various ways on different campuses. Some have used the full BASIL curriculum in the order the 11 modules are written, some use only part of the curriculum or use it in a different order, and others have used individual modules as stand-alone activities (7). The computational modules require less time, ranging from a BLAST investigation (∼1 h) to a molecular docking study (∼3 h), and some instructors combine two or more modules in a single 3-h laboratory period. The wet lab modules vary greatly, and individual modules can span several class meetings. The instructor resources provide insights into implementation modifications that can reduce the in-laboratory time required for the students; having a teaching assistant or a laboratory-preparation person can also reduce student in-laboratory time. Some instructors have also introduced BASIL as an independent research project.

In a typical BASIL implementation, teams of two to three students are provided with a Protein Data Bank (PDB) or UniProt entry that is listed as having unknown function, and they study it over the course of a semester by using the BASIL modules. Our initial efforts focused on putative hydrolases that had been identified from the list of ∼4,000 experimentally determined protein structures of unknown function in the PDB. As our work expanded to more campuses, we quickly realized that the diversity of instrumentation, academic schedule, and faculty interest required that we ensure that the BASIL curriculum can be adapted to a wide variety of situations. Section III provides insights into the changes we have made and the ways in which these changes have been implemented and shared.
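An instructor who assigns a mix of PDB and UniProt entries may want a quick check that each assigned identifier is well formed before handing it to a student team. The following is a minimal sketch, not part of the BASIL curriculum: the function name `classify_entry` is hypothetical, the PDB pattern assumes the classic four-character identifier, and the UniProt pattern follows the accession format published in the UniProt help pages. The example identifiers are the entries named later in this article.

```python
import re

# Assumption: classic four-character PDB ID (a digit 1-9, then three alphanumerics).
PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")

# UniProt accession format (6 or 10 characters), per the UniProt help pages.
UNIPROT_AC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def classify_entry(identifier: str) -> str:
    """Return 'pdb', 'uniprot', or 'unknown' for an assigned identifier.

    Hypothetical helper for illustration only; it checks the format of the
    identifier and does not confirm that the entry exists in either database.
    """
    text = identifier.strip()
    if PDB_ID.fullmatch(text):
        return "pdb"
    if UNIPROT_AC.fullmatch(text.upper()):
        return "uniprot"
    return "unknown"


if __name__ == "__main__":
    for entry in ["3R8E", "3HDT", "1ZBS", "Q8DN35", "P30646", "P76586"]:
        print(entry, classify_entry(entry))
```

Because the two formats differ in length, a well-formed identifier cannot match both patterns, so the order of the checks does not affect the result.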

II. RESULTS OF EARLY BASIL IMPLEMENTATIONS

The BASIL community provided honest feedback about motivation and challenges they encountered with the curriculum (8). When asked why they joined BASIL, users responded on a Likert scale to a range of options, including “I thought it would look good for promotion and tenure,” “I was bored with the previous curriculum,” and “I think it’s the best way to teach these techniques.” BASIL practitioners have implemented some or all of the wet lab and computational modules in laboratory sizes ranging from 6 to 25 students. In addition to the 11 core BASIL modules, users have developed modules that focused on techniques (e.g., pipetting, buffer design and preparation, western blots, plasmid map analysis, restriction digests) and on enzymes other than serine hydrolases (kinases and Nudix hydrolases). They indicated that some issues had little effect on implementation of the BASIL curriculum (e.g., lack of alignment between lecture and laboratory, lack of institutional support), whereas others proved to be more challenging (e.g., student frustration with lack of “success,” limited time with a new curriculum).

Early research in the BASIL CURE focused on identifying anticipated learning outcomes (ALOs) for students engaged in biochemistry CUREs (9). A five-step process was established for identifying course-based undergraduate research abilities:

1) a content analysis of the lab protocols, 2) an open-ended survey about how scientists conduct similar research to what the students do when performing the lab protocols, 3) a follow-up semi-structured interview, 4) an alignment check of the generated ability statements across the previous steps in the process, and 5) a Likert survey to prioritize the identified Course-based Undergraduate Research Abilities (9).

This approach was used to identify 43 top-, medium-, and low-rated Course-based Undergraduate Research Abilities, which were then organized in a matrix to demonstrate the ways these research abilities enabled students to achieve seven ALOs (e.g., interpret data to understand a biochemical meaning) across the computational and wet lab modules of the BASIL curriculum (10). Further discussion on results from the BASIL CURE can be found in section III.B and section III.C.3.

III. EXPANDING, ADAPTING, AND EVOLVING

After we had assembled a stable curriculum and a strong core team of users and developers, we addressed challenges that arose with implementing the curriculum. This section contains information on how we identified and addressed challenges in four areas that affected BASIL implementation on many campuses: software installation, the COVID-19 pandemic, sustainability, and adapting to advances in technology.

A. Software installation

The original BASIL curriculum required instructors to install software on their computers (e.g., PyMOL, the ProMOL plug-in for PyMOL, and PyRx) (1, 2, 6). This was a persistent problem for several reasons, with the most obvious being use of a Mac versus Windows device. Versions of these programs were available for both operating systems, but the versions were not identical.

Cost was also an issue. The ProMOL plug-in worked with the educational version of PyMOL for Windows but only with the purchased version of PyMOL for Mac. We began using PyRx for our docking studies. PyRx is built on open-source software (AutoDock, Open Babel), but a licensing fee is charged (5, 11). Furthermore, PyRx function was lost during a Mac operating system upgrade several years ago and has not been restored.

A third issue occurred at the institutional level, where campuses often have firewalls in place to protect against malicious software. In addition, many campuses will not install or allow installation of software that is not from a major vendor (e.g., Microsoft, Adobe) on campus computers. Finally, for students using public computers or basic notebook computers (e.g., Chromebooks), installation of this software was not possible. This last issue is a particular concern on campuses with limited resources.

The solution to the installation problem was to develop or find web applications that could be opened from a standard Internet browser on any operating system (Fig 1).

Fig 1. Responses to software challenges and innovations in machine learning. Three of the original computational modules required significant changes. New computational developments in protein structure and function prediction yield the potential for new modules. API, application program interface; BLAST, Basic Local Alignment Search Tool; CLEAN, Contrastive Learning-Enabled Enzyme Annotation; OS, operating system; SPRITE, Structural Protein Motif Database Searching Program.

Citation: The Biophysicist 6, 1; 10.35459/tbp.2024.000273

1. Module 1: Active site alignment

A team of software engineering students at RIT was recruited to develop a web application to replace ProMOL. The team was successful in developing Moltimate, which provided active site alignments for query structures. However, we quickly found that even minor changes in the Mechanism and Catalytic Site Atlas website, which provides the active site templates, or the PDB, which is the source of the protein structures, resulted in application failure (12). We learned that creating a functional web application is very different from maintaining it over time, because the websites it depends on change their application programming interfaces.

In response, we began looking for available web applications to fill this role. Bonnie Hall (Grand View University, Des Moines, Iowa), who joined BASIL in 2018 after meeting Roberts at a BioMolViz workshop, identified the Structural Protein Motif Database Searching Program (SPRITE) as a possible replacement for ProMOL and PyMOL for active site alignments (13). SPRITE allows a user to upload a PDB identification and then search that protein structure for known active site motifs (14). An additional benefit of using SPRITE is that users can view their results directly on the website without the need to install specific molecular visualization software. Steve Mills (Xavier University, Cincinnati, Ohio), who joined the BASIL team after meeting Pikaart at the American Society for Biochemistry and Molecular Biology Transforming Undergraduate Education in the Life Sciences meeting in San Antonio in Summer 2019, worked with Hall to create the active site-alignment module. The module provides instructions to enable users to perform their alignments with SPRITE and then view the results in more detail by using locally installed software (i.e., PyMOL, Chimera [UCSF Resource for Biocomputing, Visualization, and Informatics, San Francisco, California]).

Molecular visualization software is especially helpful with the BASIL curriculum. PyMOL and Chimera are two popular programs for viewing protein structures, and we still recognize the value of local software installation (2, 15). Both programs require local installation on the user’s computer and can be accessed for free; however, PyMOL requires users to apply for the free educational version. Initially, we focused on using PyMOL, but then Jon Dattelbaum (who joined BASIL in 2020 after a colleague shared one of the BASIL publications with him) and Mills advocated for using Chimera as the preferred program because it is free for anyone, and it integrates well with SwissDock (16). The latest active site-alignment module on the BASIL website is based on SPRITE and supports visualization with Chimera (15).

2. Module 3: Protein families

Challenges with software installation were encountered with modules 1 and 5. In contrast, the Pfam alignment in module 3 was built around the Pfam web application, and it illustrates the disadvantage of relying on website tools for the BASIL curriculum (17). When websites are updated or discontinued, the modules must be updated to remain useful. In January 2023, the Pfam website was decommissioned. The functionality of Pfam was transferred to the InterPro website, which combines the functions of multiple sequence search engines and enhances the function of Pfam; however, the input and output of this newer website are different from those described in the original BASIL Pfam module. In response, the modules team (Sikora, Mills, Dattelbaum, and Koeppe) updated this module and renamed it to reflect the new InterPro web interface: Using InterPro to Predict Protein Function (18). The learning goals for the module and the content remain largely the same, but care was taken to ensure that all instructions and screen captures reflect the InterPro implementation of Pfam. It is clear that modules must be maintained and updated regularly (more detail is provided in section III.D.4). It should be acknowledged that this work is often a labor of love, because grant funding is never permanent and does not cover the full costs associated with module development.

3. Module 5: Molecular docking

The original BASIL docking module was based on PyRx, which incorporates Open Babel and AutoDock (5, 6, 11). This required that faculty set up their own servers to run the software, a daunting barrier for many because of the necessary computer resources and expert knowledge. To overcome this challenge, Mills, Koeppe, and Dattelbaum adapted the docking module by using the online tool SwissDock (15, 19, 20). SwissDock provides a user interface and online help tools, along with servers that run the docking calculations. Protein structures can be added by PDB identification or as files in PDB or mmCIF format (21).

SwissDock provides an extensive library of molecules ready to dock, or users can upload their own. Results can be viewed directly on the website. Alternatively, SwissDock results can be exported to Chimera for a more detailed analysis. SwissDock typically identifies several potential binding sites on the protein, including the active site. Docking can also be set up to fit the small molecule into the SPRITE-identified active site. The SwissDock results may provide evidence to support the proposed enzyme active site. However, students must take care when interpreting SwissDock results with different ligands as they attempt to determine the best substrate for activity assays, because SwissDock does not include a straightforward path to consider which substrate might be best.

B. COVID-19 pandemic

The BASIL team identified and described several strategies used during the COVID-19 mandated emergency shift to a remote-learning environment (22). The modular nature of the BASIL curriculum allowed for a great deal of flexibility while maintaining the ability to teach complex biochemical concepts and help students develop research skills. Shifting to remote teaching presented many challenges, but common hurdles across BASIL adopters were the inequity in computational resources among students and inability to be present in a laboratory setting.

Solutions in response to the challenges presented by the COVID-19 pandemic included using the existing computational modules, adapting the wet lab modules, and introducing Proteopedia for more interactive project reports.

We found several approaches to effectively overcome the challenges: incorporate breakout rooms where one member of a student team performed the bioinformatics experiment while others assisted, provide videos of laboratory procedures, use BASIL modules explicitly created for remote environments, and incorporate data sets from previous semesters. Instructors also used small-group and one-on-one meetings with students to develop and refine hypotheses. Multiple-choice questions were also used to probe student knowledge and increase their interaction during online lectures and laboratories. Despite the jarring shift from in-person instruction, students were still able to take an active role in research that culminated in posters and written and oral reports of their research.

Furthermore, we were able to show an improvement in students’ self-reported knowledge of, experience with, and confidence in the ALOs specific to the BASIL CURE, as well as the bioinformatics and biochemical techniques taught during the BASIL CURE. These self-reported gains were obtained from a tailored Participant Perception Indicator survey developed by Irby et al., which contains questions targeting specific ALOs, wet lab techniques, and computational programs (23). These gains are seen in a limited sample of Winter 2020 students and supported by subsequent research from a larger student population during the completely virtual Summer 2020 semester (22, 24). One of our long-term goals is to build capacity in the BASIL community for measuring actual changes in student learning.

1. Seamless use of the computational modules

After the initial emergency responses described in Sikora et al. and summarized previously, the BASIL team identified a need to develop tailored experiments for the virtual environment (22). The five computational modules could be easily adapted to an online setting. These experiments were used in their unmodified form as part of an online laboratory experience; they focused on the same learning goals and learning objectives that would occur face-to-face, by using the existing assessments and instructor’s notes.

2. Simulating the wet lab modules

The six remaining wet lab modules contained a great deal of information on the experimental theory and detailed procedures, in addition to video explanations, sample student data, and postlaboratory exercises. We believed that using these laboratories in a remote-instruction setting would expose the students to the theory of these methods and the data analytics but would not expose them to the critical thinking required to make decisions at the bench, when things often do not happen exactly as expected. Arthur Sikora, who joined the BASIL project in 2016 after meeting Koeppe at a Biophysical Society meeting, led a team of BASIL faculty and students who used Google Forms to design a series of online modules that simulated wet lab experiments with decision trees and built-in error recovery while maintaining the approach, ALOs, and instructor resources typical of the other BASIL modules.

The first step in this process involved adapting and updating current learning goals and objectives for the online environment. These modified tools guided the creation of interactive, adaptive experiences that challenge students to make choices they might encounter in a hands-on laboratory setting. By using the fully customizable Google Forms platform, the narratives in the six wet lab modules were adapted to incorporate text and/or video teaching material and interspersed with interactive elements, such as multiple-choice or numerical answer questions that require a correct path of responses to proceed through each step. Multiple data sets from the work of previous students were central to this design.

Using the existing wet lab procedures, the BASIL modules team is finalizing an online module for each experiment that focuses on exposing students to laboratory concepts in a virtual environment. Students are given the information needed to run the experiment in a laboratory setting. The modules also feature images of laboratory equipment and experimental procedures normally encountered during an in-person experience to further students’ immersion in the material. Multiple-choice questions let students explore checkpoint concepts frequently throughout the procedure. These assessments are designed as low-stakes, iterative learning experiences: students must continue answering a question until they select the correct answer. Just-in-time information, feedback, and hints are included to help students understand incorrect answers and lead them to the correct choice. Several free-response questions are also included throughout each of the fully online modules. Each Google form provides a summary of student responses, organized by student-entered email address, that faculty can access as needed. We are planning to release the full suite of online BASIL modules during the Fall 2025 semester.
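The checkpoint behavior described above (answer until correct, with a hint after each incorrect choice) can be sketched in a few lines of Python. This is an illustration of the logic only, under stated assumptions: the BASIL modules are built in Google Forms rather than in code, and the `Checkpoint` class, the `run_checkpoint` function, and the sample question are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """One multiple-choice checkpoint (hypothetical structure)."""
    prompt: str
    choices: dict            # choice label -> choice text
    correct: str             # label of the correct choice
    hints: dict = field(default_factory=dict)  # wrong label -> hint text


@dataclass
class CheckpointResult:
    passed: bool
    attempts: int
    hints_shown: list


DEFAULT_HINT = "Review the protocol for this step and try again."


def run_checkpoint(checkpoint, answers):
    """Process a student's answers in order until the correct one appears.

    Each incorrect answer produces a hint; the student passes only when the
    correct label is selected, mirroring the repeat-until-correct design.
    """
    hints_shown = []
    attempts = 0
    for answer in answers:
        attempts += 1
        if answer == checkpoint.correct:
            return CheckpointResult(True, attempts, hints_shown)
        hints_shown.append(checkpoint.hints.get(answer, DEFAULT_HINT))
    # The student ran out of recorded answers without reaching the correct one.
    return CheckpointResult(False, attempts, hints_shown)


# Hypothetical sample question, written for this sketch only.
EXAMPLE = Checkpoint(
    prompt="Which technique separates proteins by size?",
    choices={"A": "SDS-PAGE", "B": "BLAST search", "C": "Molecular docking"},
    correct="A",
    hints={
        "B": "BLAST compares sequences; it does not separate proteins.",
        "C": "Docking predicts ligand binding computationally.",
    },
)

if __name__ == "__main__":
    result = run_checkpoint(EXAMPLE, ["B", "C", "A"])
    print(result.passed, result.attempts, result.hints_shown)
```

Recording the number of attempts and the hints shown is one way an instructor could see where students struggled without attaching a grade penalty to incorrect answers.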

3. Proteopedia

Another of our long-term goals is to design course activities that allow students to communicate their findings to stakeholders, which has been identified in CUREnet as a critical component of CUREs (25). One of the challenges of offering instruction in a CURE format is generating enough data for a peer-reviewed publication. Each student or group works on the project for a single term, so progress is slow, and variations across campuses (e.g., protocols, protein preparation) occur. Hall attended a Proteopedia workshop led by Jaime Prilusky and Joel Sussman at the 2018 Biennial Conference on Chemical Education. We now have a collaboration with Proteopedia that allows BASIL student participants to publish their work (26). The motto of Proteopedia is “As life is more than 2D, Proteopedia helps to bridge the gap between 3D structure and function of biomacromolecules.” Proteopedia aims to

organize and disseminate structural and functional knowledge about biomacromolecules, their assemblies and interactions with small molecules, in a user-friendly way to a broad scientific audience as a free, collaborative 3D-encyclopedia of proteins and other biomolecules. (Proteopedia)

Some BASIL instructors have assigned students a private Proteopedia page where they can record their results, upload images, create tables, and use PDB files with the provided graphical user interface to create interactive images of their protein structures and docking results (26). The faculty mentor reviews the work at the end of the term for completion and any copyright issues. After the mentor has approved the page as having sufficient content and quality, it is then made public in the BASIL category. A DOI can also be requested for each public-facing page. This results in published content for the students and the faculty mentor and also enriches the experiences of other stakeholders in the process. Students have published pages based on three PDB entries: 3R8E, 3HDT, and 1ZBS. Students have also published three pages based on UniProt entries: Q8DN35, P30646, and P76586. Even if the students’ work is not ready to be made public, they still learn how to assemble their data into a publication format and how to leverage 3D images to assist in telling their story.

C. Sustainability

BASIL began in 2015 with a team of faculty members from seven different campuses, supported by funding from the NSF IUSE program. Initially, we met via video conference on a weekly basis to discuss current issues and plan future development. In 2017, our first manuscript comprised interviews of the original adopters, including questions about challenges they faced, aspects that were or were not working, and their self-perception within the process (8). It should be noted that each interviewee wanted at least one full, in-person group meeting included in our budget, even though this was not one of the questions for the interview. We also obtained a second round of NSF funding from 2017 to 2021. As we began to grow, we realized that we could no longer support the BASIL project with weekly town hall–type meetings.

Our solution for sustainability included modifying the organizational structure; implementing a plan for diversity, equity, and inclusion; and adjusting the focus of our educational research.

1. Changes to the BASIL organization

As we entered 2021, we found that we had created valuable resources, gained some recognition at a national level, and were growing (to as many as 12 campuses, with NSF support to expand significantly by 2027). In Summer 2022, the full BASIL core team met in Indianapolis, Indiana, for a long weekend before the Biennial Conference on Chemical Education at Purdue University. We reviewed our history, made plans to fulfill the promises in our recently funded NSF proposal, considered components of BASIL that were not included in the proposal but were required for our survival, and discussed a sustainability plan and leadership transition over the next 5 years.

We established six committees (assessment, communication, data management, instructor support, module development, recruitment), with two to four core team members on each committee. In addition, we formed a steering committee of four members, with the intention that the original principal investigator on the project would transition leadership to the three other members as he began transitioning to retirement. Sustainability was the focus of a presentation at the DiscoverBMB 2024 meeting of the American Society for Biochemistry and Molecular Biology in San Antonio, Texas; we also plan to prepare a manuscript that will explore more deeply how to sustain CUREs, addressing leadership, funding, participation, mentoring, maintaining resources, supporting the community, building a strong education research component, and professional development.

2. Diversity and accessibility

One of the current goals for the BASIL project is to contact faculty from Historically Black Colleges and Universities (HBCUs) and Minority-Serving Institutions to gain their perspective on barriers to participation in CUREs (including BASIL) based on institution types and to provide opportunities (with ongoing support from the BASIL community) for professional development.

On the advice of our advisory board, we offer workshops on campuses around the nation, including two workshops at HBCUs to date. Our first in-person workshop, held at Fayetteville State University (FSU; an HBCU in North Carolina), was based on an existing connection between Craig and James Raynor, a professor at FSU, through the National Institute of General Medical Sciences Undergraduate Research Training Initiative for Student Enhancement program, which is hosted at both RIT and FSU. Raynor helped the BASIL team host a workshop with colleagues from other local campuses, including several other HBCUs.

After an introduction by James Raynor, we held a second in-person workshop at Prairie View A&M University (Prairie View, Texas) hosted by Harriette Howard-Lee Block. Our plan is to offer two virtual and two in-person workshops each year through 2026. Another aspect of inclusion is to ensure that our protocols and materials are freely accessible to all; we develop relationships with suppliers, such as DNASU, to provide BASIL materials at minimal expense; and we create alternative protocols for instructors who have limited instrumentation in their teaching laboratories (e.g., alternatives to spectrophotometers and sonicators) (27).

Another goal in BASIL is to support faculty development at all levels. We begin our discussions with new members on the tenure and promotion policies (what is written) and expectations (what is not written) on their campuses. We emphasize regular publication and encourage members who are considered for tenure and promotion to take the lead wherever possible. We also emphasize the pursuit of external grant funding, helping new members identify new funding opportunities that will enhance the BASIL project or develop into new projects. BASIL participation has helped several core team members in the tenure and promotion process.

3. Focus of BASIL education research

Early in the development of BASIL, it became clear that having a highly qualified, discipline-based education researcher with a Ph.D. student was essential to sustainability. We needed the researchers’ input to shape our decisions, develop our curriculum, and assess student learning and faculty development. Trevor Anderson (Purdue University) was our original education researcher, and his Ph.D. student, Stefan Irby, analyzed the development of BASIL, which resulted in three publications that have had a major effect on our curriculum and team members (9, 10, 23). Irby also provided advice on assessments that helped us shape the modules that constitute our curriculum. More recently, Erika Offerdahl (Washington State University, Pullman, Washington) has joined our project as lead educational researcher for BASIL. She and her Ph.D. student, Diane Ogedi Ugwu, are focusing on the experiences and challenges that faculty members encounter in implementing the BASIL CURE in several campus settings. Initial observations are provided as follows.

Instructors express initial interest in implementing BASIL because of its potential to provide students with authentic research experiences, integrate computational tools into the laboratory curriculum, and increase student engagement and interest in biochemistry. However, instructors from different institutions face several challenges (categorized as technical, instructor related, and student related) whose effects range from minimal to long term (7). Faculty report facing logistical challenges, including a lack of time, resources, and support while transitioning to a CURE-type laboratory course (8). These are primarily associated with developing instructional modules that can be adapted on multiple campuses, experiencing software compatibility issues, and needing more technical support. Instructors further expressed concerns about assessing student-learning outcomes and the potential effect of implementing the BASIL CURE on their research productivity. Instructors also face challenges in developing materials and resources, preparing for the course each week, learning unfamiliar techniques, and addressing change on multiple levels (8). Most of these initial challenges were linked to the modules not being fully operational when the instructors began teaching the laboratory sessions. Instructors also need additional training and support in implementing the BASIL CURE approach, particularly in computational tools and course design (7).

Despite these challenges, instructors remained enthusiastic, presented the BASIL modules at their favorite national conferences, and shared the approach with their colleagues (8). In addition, the BASIL community conducted weekly video conferences to provide robust peer support while also addressing challenges faced on individual campuses (7, 8). This collaborative effort helped refine the BASIL CURE modules, paving the way for their seamless integration in the future. The BASIL CURE model presents a valuable platform for students to participate in research, but its successful implementation requires meticulous planning and unwavering support. Flexible implementation of the BASIL curriculum could also help resolve many implementation obstacles. Instructors who had the freedom to adapt the BASIL modules to fit their specific teaching needs and constraints were more likely to implement BASIL successfully (7). This flexibility allowed faculty members to tailor the laboratory experience to their students’ needs and abilities while addressing their own resource and time constraints. During the COVID-19 pandemic, we adapted and offered the BASIL modules for remote instruction (22).

To provide insight into students’ experiences with the BASIL CURE and the benefits they derive from it, Irby et al. studied students’ perceptions of changes in their knowledge, experience, and confidence regarding specific BASIL ALOs after participating in a BASIL CURE (23). Students perceived significant gains in the ALOs regardless of implementation style (i.e., computational or biochemical–wet lab). Specifically, students reported having more confidence in their ability to design experiments, analyze data, and communicate scientific findings (23). Additionally, students indicated that the CURE laboratory experience helped them better understand the research process and improved their critical thinking skills. They further revealed that the BASIL CURE was challenging but rewarding, providing a more authentic research experience than traditional laboratory courses. Ultimately, they reported being more engaged and motivated in the CURE, which led to increased learning and a greater appreciation for research processes.

Through the NSF IUSE: Education and Human Resources program, the future goal of the BASIL project is to assess the role of faculty support in the adoption, implementation, and sustainability of the BASIL CURE strategy in diverse institutional contexts, particularly in Minority-Serving Institutions, HBCUs, tribal colleges, and community colleges. As a result of this implementation and assessment process, the project will contribute to strengthening the diverse workforce in sciences, technology, and medicine.

D. Adapting to advances in technology

One of the challenges in our project is to ensure we remain current with developments in the life sciences. About one-half of the BASIL modules use web applications and/or programs to predict the function of proteins, which are then tested in the wet lab. The field of protein-function prediction is changing dramatically with the development of new algorithms, particularly as artificial intelligence and machine learning (AI/ML) are applied to predicting protein structure and function. We are developing several modules that incorporate these new AI/ML tools.

To keep pace with these advances, we will identify and incorporate new developments in protein structure and function prediction.

1. AlphaFold

Recent breakthroughs in protein structure prediction using AI/ML methods have resulted in a massive increase in the number of proteins with solved structures and unknown function. AlphaFold created a protein structure database containing >200 million predicted protein structure models (28, 29). Users can also submit their own sequences to obtain a predicted protein structure. In addition, individuals can use public code and notebooks to apply the algorithms themselves to predict and generate structures for proteins that do not have experimentally determined models in the Research Collaboratory for Structural Bioinformatics PDB. Given the number of genome sequences available, AlphaFold and other protein structure databases, such as Evolutionary Scale Modeling Fold and RoseTTAFold, could provide a virtually endless supply of protein structures that could be characterized with BASIL (28–31).

2. Foldseek

One of the challenges we face is identifying structures of interest in protein families that are familiar to faculty members who are using BASIL. The Dali module of BASIL, which compares protein structure information, can provide good insight into a protein’s function but can take as long as several days to return results. Foldseek performs a similar comparison using a new structural alphabet based on 3D interactions (32). The software leverages a trained ML model to reduce computing time by four to five orders of magnitude, which enables users to rapidly search the AlphaFold repositories.
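To give a sense of scale for a speedup of four to five orders of magnitude, the following short sketch works through the arithmetic. The 3-day starting point is an assumption chosen only to stand in for "several days"; it is not a measured Dali run time, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reduced_runtime_seconds(original_days, orders_of_magnitude):
    """Return the run time in seconds after a speedup of 10**orders_of_magnitude."""
    return original_days * SECONDS_PER_DAY / 10 ** orders_of_magnitude


# Assumed example: a search that originally takes 3 days (illustrative only).
for orders in (4, 5):
    seconds = reduced_runtime_seconds(3, orders)
    print(f"Speedup of 10^{orders}: about {seconds:.1f} s")
```

Under this assumption, a multi-day search shrinks to a few seconds or tens of seconds, which is short enough to fit inside a single laboratory period.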

BASIL faculty and students have exhausted the potential serine hydrolases of unknown function found in the experimentally determined protein structures in the Research Collaboratory for Structural Bioinformatics PDB; however, a Foldseek search with PDB entry 1a0j (trypsin from Salmo salar) or 1pq5 (trypsin from Fusarium oxysporum) yields many predicted serine hydrolases to be explored. Protein sequences of interest can then be converted to genes that are codon-optimized for expression in Escherichia coli; subsequently, these could be expressed and subjected to the BASIL in silico or in vitro analytical approach. Foldseek has been piloted as part of a BASIL curriculum module by students working with Hall, Mills, and Koeppe. An updated module that uses Foldseek is in preparation.
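The conversion of a protein sequence into an E. coli-friendly gene can be illustrated with a minimal back-translation sketch. It is not the procedure used by BASIL or by any gene-synthesis supplier: the table below assigns one codon per amino acid that is commonly reported as frequently used in E. coli, and the names `PREFERRED_CODON` and `back_translate` are hypothetical. Real codon optimization draws on full codon-usage tables and additional constraints (for example, GC content and restriction sites).

```python
# Illustrative assumption: one frequently used E. coli codon per amino acid.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein, add_stop=True):
    """Back-translate a one-letter protein sequence into a DNA coding sequence."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise ValueError(f"Unsupported residue: {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


# Short made-up peptide, used only to show the call.
print(back_translate("MKV"))
```

Choosing a single codon per residue keeps the sketch short; a more realistic approach would sample codons in proportion to their usage frequencies to avoid long runs of identical codons.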

3. AI/ML

ML has now also been applied to predicting protein function. Several powerful ML models that use protein sequence and/or protein structure to predict protein function have been published. Some examples with public-facing submission sites include NetGO 3.0, DeepFRI, and PredictProtein (12, 33, 34). These web applications return a list of Gene Ontology terms that describe a protein’s possible function in detail (35, 36).

The Contrastive Learning-Enabled Enzyme Annotation (CLEAN) web application returns a predicted function by Enzyme Commission classification (37). A new BASIL module leverages the web-based version of CLEAN, a powerful addition to the BASIL curriculum. Use of the CLEAN algorithm to generate an initial protein-function prediction was piloted by students at Grand View University (Des Moines, Iowa) with Hall to characterize putative kinases. Students still used the BASIL modules to make a protein-function prediction but began their research by using the web-based version of the CLEAN algorithm to obtain a predicted Enzyme Commission class for a protein. This helped direct students’ initial efforts in thinking about protein-function prediction, a critical step in a single-semester research experience. A prediction using CLEAN is especially valuable when using the original BASIL modules to predict a protein’s function produces conflicting data or only vague predictions of potential protein function.
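An Enzyme Commission (EC) number has four dot-separated fields, and the first field identifies one of seven top-level enzyme classes. The sketch below shows how a student might interpret that first field. It does not call CLEAN or reproduce its output format; the function name `describe_ec` and the example EC numbers are chosen here for illustration.

```python
# The seven top-level Enzyme Commission classes.
EC_TOP_LEVEL = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def describe_ec(ec_number):
    """Return the top-level enzyme class name for an EC number such as '2.7.11.1'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # tolerate an optional leading "EC"
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"Expected four dot-separated fields: {ec_number!r}")
    top = int(fields[0])
    if top not in EC_TOP_LEVEL:
        raise ValueError(f"Unknown top-level EC class: {top}")
    return EC_TOP_LEVEL[top]


# Protein kinases fall under EC 2.7; trypsin is EC 3.4.21.4.
for ec in ("2.7.11.1", "EC 3.4.21.4"):
    print(ec, "->", describe_ec(ec))
```

Only the first field is interpreted, so partially specified numbers such as `3.4.21.-` are also accepted.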

4. Our development pipeline

The resources described previously will be converted into one or more formal BASIL modules by following our BASIL pipeline process (Fig 2). The same process is being applied to create modules on x-ray crystallography, western blots, and multiple sequence alignment. We start by preparing a student module from a BASIL template, which includes learning goals and objectives, introduction, protocol (including sections on purpose, supplies, safety considerations, procedures, and cleanup), interpreting results, and references. We use backward design to build our learning objectives—what we want the students to learn and apply in the module—which also facilitates writing the assessment questions. We then attempt to use the new module with students in the laboratory of one of the BASIL core instructors, where we seek student input to increase clarity of instruction. Then we create the instructor module with notes about alternative approaches to address issues of different facilities or resource limitations. As we design new modules, we make every effort to link directly to other BASIL modules to integrate the module into the full curriculum. After we have completed testing (usually on two to three campuses), we introduce the module to the full team in an online meeting so that others can use it. Once this process is complete and the new module is approved by the BASIL modules committee, the module is given to the members of our Data Management Committee, who post it on the BASIL Biochemistry website (38).

Fig 2. The Biochemistry Authentic Scientific Inquiry Lab (BASIL) module development pipeline.

Citation: The Biophysicist 6, 1; 10.35459/tbp.2024.000273

IV. OPPORTUNITIES

The BASIL community is a stable and growing organization that offers frequent introductory workshops as well as workshops focused on assessment in CUREs. Participants are welcome to join our online Slack community for discussing course-related issues (https://join.slack.com/t/onlinebasildiscussion/shared_invite/zt-23hiijwrr-c0tDbx8xSgNgjwconYwMVg). We are also building learning communities that are open to all participants where members can present the challenges they have encountered regarding implementation on their campuses or during the course of teaching the laboratory. To learn more about incorporating portions or all of BASIL in a course on your campus, please register for a virtual or in-person workshop on our Events page of the BASIL Biochemistry website (39). Please contact the corresponding author if you are interested in hosting a workshop at or near your home institution.

V. SUMMARY

The focus of BASIL is to predict protein function for protein structures that lack functional annotation. We began with experimentally determined structures in the PDB, and much of our work continues there. At the same time, we are expanding our approach to include computed structure models provided by AlphaFold and RoseTTAFold. This project began with several years of individual undergraduate research projects, which we transitioned to a CURE in 2015. Since that start, we have found that we must adapt as our numbers increase and the available technology in our field continues to progress. One of our long-term goals is to establish a sustainable project in which students on all campuses have access to our resources and support and the opportunity to experience the thrill of scientific discovery as part of their regular course curriculum. Our second long-term goal is to provide faculty members who implement BASIL with effective onboarding, ongoing support, and opportunities for professional development that lead to tenure, promotion, and career fulfillment. BASIL is still growing—with new AI/ML tools, this project can support a small army of undergraduates using computational and wet lab methods to predict and confirm protein function. Please contact the corresponding author to learn more about getting started with BASIL resources and opportunities for hosting and/or participating in workshops.

Copyright: © 2025 Biophysical Society.
Fig 1. Responses to software challenges and innovations in machine learning. Three of the original computational modules required significant changes. New computational developments in protein structure and function prediction yield the potential for new modules. API, application program interface; BLAST, Basic Local Alignment Search Tool; CLEAN, Contrastive Learning-Enabled Enzyme Annotation; OS, operating system; SPRITE, Structural Protein Motif Database Searching Program.



Contributor Notes

corresponding author
Received: 26 Mar 2024
Accepted: 17 Oct 2024