## I. INTRODUCTION

The epithelium is a sheet of tissue composed of tightly packed epithelial cells that line the surface of an organ, such as the intestine or skin. Although the packing and shape of epithelial cells may seem random, they are direct results of the intracellular and intercellular mechanical forces. These mechanical forces are intricately governed by well-defined molecular events and geometry rules. Epithelial tissues are arguably one of the most studied and elegant examples for investigating tissue mechanics. These cells form strong adhesion with each other and with the substrates. These properties enable epithelial cells to adopt distinct shapes and sizes in response to forces generated from adhesions, density, and movement (1). Maintaining stable and adequate cell morphologic features is also critical for supporting the function of the epithelial sheet. This epithelial sheet performs a number of vital tasks, including absorption of nutrients into the gut, secretion of proteins that form a protective hydrogel in the lungs, and protecting the body from pathogens by forming a barrier on the skin.

Dynamic changes in epithelial cell morphologies and packing also determine the overall tissue shape that is fundamental for organ development and tissue regeneration (2). Aberrations in these processes can lead to developmental defects and other pathologic conditions, including cancer (3).

Cell shape and size can be modulated by forces generated through different processes, both from within the cell (e.g., actomyosin contractility) and between cells (e.g., cell–cell adhesion and boundary constraints) (4). Meanwhile, scientists have observed striking similarities between the aggregates of epithelial cells (Fig 1A) and other jammed physical systems (Fig 1B), such as soap bubbles (5). It has been shown that the topologic and morphologic features of epithelial tissues can be predicted by geometry rules and mechanical force balance (6). Recent numeric simulations have further shown that the dynamics of epithelial morphology and corresponding tissue structure changes can also be explained by using physical models (7). All these findings suggest that even though cells are not bubbles, the shapes and packing are still strongly influenced by physics principles.

Despite the exciting growth of research in cell-level biophysics, for example, jamming transition of epithelial tissues and cell collective motion (8–10), current undergraduate biophysics curricula still heavily focus on physics at the subcellular and molecular level (11). In addition, curricular topics on cell mechanics are often restricted to the mechanical properties of tissues or single cells. Such focuses overlook emergent phenomena and governing laws, which determine macroscopic observables from microscopic ones in biologic tissues (12, 13). In this article, we introduce an active learning module that enables students to learn how the Euler polyhedral formula and interfacial tension (force at the interface of 2 nonmixing liquids) jointly explain the fundamental morphologic features of epithelial cells. Here, the Euler polyhedral formula describes how the number of faces, vertices, and edges of a polyhedron (e.g., cuboid, prism, and pyramid) are related.

By combining light microscopy with a free and publicly available analysis software, students will learn how to quantify cell morphology and compare the measurements with theoretic predictions. To achieve this goal, we have identified commercial histology slides of frog skins for demonstrating the common distribution of cell aspect ratios (ARs) and the constant cell–nucleus size ratio. The use of premade histology slides allowed us to conduct the activities without biosafety concerns. We have designed the module to equip students with both conceptual understandings of cell mechanics and practical experiences in image acquisition and analysis. Through performing this module, students will understand the morphologic signatures of jammed epithelia and gain technical skills in using artificial intelligence (AI)–based segmentation tools.

## II. SCIENTIFIC AND PEDAGOGIC BACKGROUND

We provide a brief review of the biophysics background that complements the hands-on activity reported in this work. We focus on structurally simple epithelial cells and their nuclear morphologic features, both of which can be observed by using light microscopy and quantified by using simple image analysis pipelines. Specifically, we discuss how mechanical force balance determines the cell geometry and cell–nucleus size ratio. To gain a more comprehensive understanding of epithelial mechanics, the students and instructors of the module are encouraged to read recent articles (14–16). At the end of this section, we discuss the pedagogic background and suggest teaching tips.

### A. Epithelial cell morphology and intercellular junctional tension

One morphologic hallmark of mature epithelial tissues is the establishment of the tricellular junction network and hence the hexagonal packing of cells under steady state. This hallmark is established through adhesion at intercellular junctions (17–20). D’Arcy Thompson, a pioneer in mathematic biology, was one of the earliest scientists who researched the origin of these features. In his 1917 book, *On Growth and Form* (21), Thompson suggested that the cobblestone shape of cells (Fig 1A) arises from the interfacial tension along the intercellular junction, akin to soap bubbles (Fig 1B). In this context, the straight ridges separating cells simply result from the minimization of “interfacial energy” (22).

Building upon this analogy between epithelial cells and soap bubbles, the epithelial topology, specifically, how the cell–cell junctions are connected and the number of neighbors each cell has, can be derived from geometry rules. In 2 dimensions, the sum of the numbers of cells (*F*) and vertices (*V*) minus the edge number (*E*) is always a constant, *V* – *E* + *F* = 2. This relationship is known as the Euler polyhedral formula, and the constant is called the Euler characteristic, which is typically 2. A direct consequence of the Euler polyhedral formula is that
(1)
$$\frac{1}{\langle n\rangle}+\frac{1}{\langle z\rangle}=\frac{1}{2}$$
where $\langle n\rangle$ is the average number of walls per cell and $\langle z\rangle$ is the average number of edges that meet at a vertex. Detailed derivation of this equation from the Euler polyhedral formula can be found in a previous review paper (23). In a monolayer of cells, each vertex is typically joined by 3 walls, and cells are mostly hexagonal, implying that $\langle n\rangle$ and $\langle z\rangle$ are 6 and 3, respectively.
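
The step from the Euler polyhedral formula to Eq. 1 can be sketched in a few lines. This is an illustrative outline rather than the derivation given in the cited review. It assumes that every edge is shared by exactly 2 cells and ends at exactly 2 vertices, which is only approximately true at the boundary of a finite sheet.

$$\begin{aligned}
\langle n\rangle F &= 2E, \qquad \langle z\rangle V = 2E,\\
V-E+F=2 \;&\Rightarrow\; \frac{2E}{\langle z\rangle}-E+\frac{2E}{\langle n\rangle}=2,\\
&\Rightarrow\; \frac{1}{\langle n\rangle}+\frac{1}{\langle z\rangle}=\frac{1}{2}+\frac{1}{E}.
\end{aligned}$$

For a tissue with many edges, the $1/E$ term is negligible and Eq. 1 follows. Setting $\langle z\rangle =3$ then gives $1/\langle n\rangle = 1/2-1/3=1/6$, that is, $\langle n\rangle =6$.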

The tension force along the intercellular junctions not only determines the topology of the epithelial cell network but also impacts cell motion, as well as tissue structure and rigidity (24–27). Researchers have adapted numeric models that were originally used for simulating foams to describe the structural and mechanical properties of epithelial tissues. This simulation model, known as a vertex model, minimizes an effective energy as a function of the positions of cell vertices (28). In a general version of the vertex model, each cell stores a mechanical energy
(2)
$${E}_{\text{cell}}=\frac{{\kappa}_{A}}{2}{(A-{A}_{0})}^{2}+\frac{{\kappa}_{P}}{2}{(P-{P}_{0})}^{2}$$
that implies the existence of an *optimal* cell area and perimeter due to intercellular junction tension. Here, *A*, $A_{0}$, *P*, and $P_{0}$ are the area, optimal area, perimeter, and optimal perimeter of a cell, respectively. Also, $\kappa_{A}$ and $\kappa_{P}$ are the parameters related to the cellular junction tension, actomyosin contraction (i.e., tensile stresses generated by myosin motor proteins pulling on cross-linked actin bundles), and cell stiffness (e.g., Young modulus and bulk modulus). Using this effective energy, scientists have been able to describe various biologic processes, such as gastrulation, appendage formation, cell migration, and vesicle growth (29). One recent important finding is the epithelial structural change associated with the jamming transition, where cell proliferation and motility dramatically slow down as the cells crowd (30). Vertex model simulations identified a critical value of mean shape index $q=P_{0}/\sqrt{A_{0}}=3.81$. In simulation, when *q* is larger than 3.81, the cells are motile and capable of rearranging, and tissues are “fluidlike.” Conversely, when *q* becomes smaller than 3.81, the cells become immobile, and tissues are “solidlike” (31). In this perspective, the shape index acts as an order parameter that identifies the fluid–solid transition of epithelial tissues. This is akin to traditional phase transitions, including liquid freezing, the paramagnetic-to-ferromagnetic transition (i.e., change of magnetic properties of materials), and Bose–Einstein condensation (i.e., a quantum phenomenon in which a large number of bosons simultaneously occupy the system’s ground state).
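
To make the energy of Eq. 2 and the shape index more concrete, the following minimal Python sketch evaluates both for idealized cells. It is an illustration under stated assumptions, not a vertex model simulation: the function names are ours, the stiffness parameters default to 1 in arbitrary units, and the cells are taken to be regular polygons, for which the shape index depends only on the number of sides.

```python
import math


def cell_energy(area, perimeter, area0, perimeter0, kappa_a=1.0, kappa_p=1.0):
    """Mechanical energy of a single cell following Eq. 2 (arbitrary units)."""
    return (0.5 * kappa_a * (area - area0) ** 2
            + 0.5 * kappa_p * (perimeter - perimeter0) ** 2)


def shape_index(perimeter, area):
    """Dimensionless shape index: perimeter divided by the square root of area."""
    return perimeter / math.sqrt(area)


def regular_polygon_shape_index(n_sides):
    """Shape index of a regular polygon; P^2 / A = 4 n tan(pi / n) for any size."""
    return math.sqrt(4.0 * n_sides * math.tan(math.pi / n_sides))


if __name__ == "__main__":
    for n_sides in (5, 6):
        print(n_sides, round(regular_polygon_shape_index(n_sides), 3))
    # A circle gives the smallest possible shape index, 2 * sqrt(pi).
    print("circle", round(2.0 * math.sqrt(math.pi), 3))
```

Running the sketch shows that a regular pentagon has a shape index of about 3.812 and a regular hexagon about 3.722, so the simulated threshold of 3.81 sits between hexagon-like and more elongated or irregular cell shapes.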

### B. Nucleus size regulation

The nucleus is the largest and the most prominent cellular organelle. Since more than a century ago, scientists have found that although the size of the nucleus can vary greatly from cell to cell within a tissue, it is correlated with the size of the cell (32). In brief, the ratio between nuclear size and cellular size, the karyoplasmic or nuclear-to-cytoplasmic ratio, has been found to be a constant across numerous cell types. Despite the long history of this surprisingly simple and common finding, researchers are only beginning to understand its governing mechanism. It has long been postulated that the nucleus size is directly regulated by the DNA content (33). However, the current view posits that the cell size is the main regulator of nucleus size (34). Recent experiments showed that the main factors determining the nuclear-to-cytoplasmic ratio include the nuclear and cytoplasmic contents (e.g., proteins), the nuclear-to-cytoplasmic shuttling of proteins, nuclear membrane stiffness, and the cytoskeletal connection between the nuclear membrane and the cytoplasm (35). Specifically, recent theories have suggested that the osmotic pressure imparted by the nuclear and cytoplasmic contents may play a dominant role in deciding the nuclear-to-cytoplasmic ratio over other factors (36, 37). Through our learning module, the students will observe that the nuclear-to-cytoplasmic ratio is roughly a constant within a cell type, because these cells share similar gene expression profiles and osmotic pressure across the nuclear envelope.

### C. Pedagogic background

We include a breakdown of the scientific concepts covered in this module and its corresponding teaching applicability for each education level, ranging from elementary to graduate-level education (Table 1). Suggested modifications or teaching tips for each module component will be discussed within the appropriate section throughout the text. In brief, our module contains 3 main teaching aims covering biophysics concepts, image acquisition and processing, and data analysis. Conceptually, we envision that most students at any education level can understand the epithelial–foam analogy by using visual aids. For example, because understanding phase transitions is listed as a topic within the Next Generation Science Standards for fifth grade (38), we anticipate that all students beyond elementary school will be able to understand the jamming transition as a state change from a fluidlike to a solidlike system. As a teaching tip to facilitate motivation for the module, students can experiment with some bubbles, or foam, to visually observe how the system packs. Alternatively, counting vertex-forming boundaries of foam can be issued as a prelab for more independent students. Due to its complexity and interdisciplinary nature, we recommend reserving nucleus-to-cytoplasm ratio discussions for college-level students.

Regarding hands-on skills covered in the image acquisition and processing section, we anticipate that all levels above third grade will be able to operate a standard light microscope, in which there are only 2 main adjustable parameters to control the light intensity and focus plane. Because segmentation may require some quality control performed by students, we anticipate students beyond elementary school to be adequately equipped for this, and advanced high school or college-level students to be capable of performing iterative model training.

Finally, with respect to analysis, we anticipate that intercellular junction counting can be performed by students beyond elementary school, while generating AR histograms can be performed by advanced high school or college-level students. Because investigating nuclear-to-cytoplasmic correlation integrates fundamental principles from life science, physical science, mathematics, and statistics, we anticipate that only college-level students will be able to fully appreciate this analysis. We recognize it is critical to introduce image analysis to students at a young age to develop mathematic and scientific intuition, so we have included analysis adaptations that can be conducted by elementary-level students in section V of this article.

## III. MATERIALS AND METHODS

Our experimental module can be divided into 3 main parts, as summarized in Figure 2A. The first part is the acquisition of histology images (top box, Fig 2A). After obtaining the images by using light microscopy, the characterization of cellular and nuclear morphology will be conducted in 2 sequential steps. As shown by the middle box in Figure 2A, the images are first segmented to outline the boundary of cells and nuclei, effectively separating them into distinct objects. The recorded boundary coordinates are then used for calculating the enclosed area and AR of each object (bottom box in Fig 2A). Finally, to obtain the nuclear-to-cytoplasmic ratio of each cell, the cell and nucleus measurements are paired by matching the centroids, which can be directly obtained as a measurement in ImageJ, version 2.14.0/1.54f (39), with respective ratios subsequently calculated. We anticipate that the image acquisition and AI segmentation will take approximately 1 h each per histology slide, and corresponding morphologic analysis will take about 6 h.

### A. Histology image acquisition

In this work, we used commercial histology sections of the epidermis (outer layer) of frog skins (Triarch, Ripon, WI) to study the 2-dimensional (2D) epithelial cell morphology. Finding suitable 2D epithelial histology sections could be challenging because epithelial monolayers are usually only found in tubular (kidney) or glandular (prostate) structures, which are typically small (e.g., <100 $\mu$m) and difficult to flat mount on microscope slides. We addressed this challenge by studying the outermost layer (i.e., stratum corneum) of frog skin, which is composed of a layer of terminally differentiated epithelial cells. As illustrated in a cross-sectional view of frog skin (Fig 2B), such a layer of cells is substantially different from and thinner than the multilayered stratum germinativum underneath it and, therefore, can be roughly considered as a 2D epithelial system.

We used a transmitted light microscope (Olympus CKX41, objective 40× numerical aperture [NA] 0.55, Olympus, Breinigsville, PA) to image the histology slide. An example image is shown in Figure 2C. We used a relatively high NA objective to suppress light scattering from background cells and extracellular matrix. Although the background suppression, hence the use of high NA objectives, may not be essential, it can potentially improve image qualities and image segmentation robustness. Images acquired by using different objectives and modalities of transmitted light microscopy are shown in Figure 2D–F. We also found that it is unnecessary to capture a large field of view by using low-magnification lenses, because the skin sample is often wrinkled, and only up to approximately 50 cells can be focused simultaneously. Based on these observations, we envision that any other student compound microscope (e.g., AmScope B120, Optika B-60, Meiji MT-30, or Leica DM300) would produce a similar image quality. Also, because we capture only approximately 50 cells per field of view, a 1-MP camera can readily resolve the morphology of cells and nuclei. Finally, we have uploaded 50 example images (40× phase contrast) used for analysis in this work as supporting materials (40). These images not only enable students without access to a microscope to complete the module but also serve as a reference for segmentation testing.

### B. Image segmentation

In this work, we perform image segmentation that effectively extracts cell or nuclear contours by outputting the coordinates of the pixels that make up the outline of the feature of interest. We use segmentation in order to later perform morphologic quantification. To perform image segmentation (Fig 3A), the acquired histology images were loaded into an AI-based segmentation software, Cellpose, version X (41). Although Cellpose is compatible with many imaging modalities (e.g., fluorescence, bright field, and phase contrast), we used phase contrast images as our input because this modality is widely accessible and can contrast the nuclear and cellular boundaries with ease. Cellpose currently has 2 versions: Cellpose 1.0 and 2.0. Briefly, Cellpose uses computational neural networks to analyze the imported images, and a numeric function of the corresponding image pixels is created to predict the outlines of features, such as cells and organelles. In Cellpose 1.0, the users use preestablished AI models for segmenting the images, whereas Cellpose 2.0 allows the users to either use preestablished models or train and correct new AI models. This AI-based segmentation tool has several advantages over traditional parametric methods. Cellpose is capable of accurately segmenting cells in a variety of imaging modalities. In addition, its deep learning approach reduces the need for manual intervention and can be trained with relatively few manual examples if the preset algorithms are not geared for the images being segmented. This image segmentation platform has been widely used in current research (42–44) and combined with many other cell characterization tools (45).

We used images saved as Joint Photographic Experts Group files to reduce the file size, although Cellpose is compatible with most bitmap file types. Once our image was imported, we adjusted the cell diameter parameter in the *Segmentation* module in the graphical user interface (GUI) to roughly match the diameter of the cell or nucleus, as indicated by the top yellow arrow in Supplemental Figure S1. All other parameters were left at the default setting, including the *flow_threshold* and the *cellprob_threshold*. Next, we executed the segmentation by choosing the appropriate compartment model from the *model zoo* in the GUI. For example, if cell boundaries are to be extracted, the *cyto* model, indicated by the bottom yellow arrow in Supplemental Figure S1, should be selected. Following this, the results were displayed as a mask overlay on the input image. To clean up data, we removed falsely detected masks and added masks that may have been missed, by control clicking and right clicking, respectively. After mask clean up (Fig 3B), we saved the outline boundary coordinates as a text file that would later be read into ImageJ, which is a common open-source software for image analysis (39). This step will be discussed further in the following. Analyses can also be performed by using other platforms, such as Python and MATLAB. We presented the ImageJ example in this work due to its user-friendly interface with no coding experience required.

We found that the nuclear segmentation results were suboptimal when using the Cellpose nuclear model (Fig 3C). This may be attributed to the fact that Cellpose is often performed on fluorescence images, in which the nuclei are the only feature in each image with a high contrast. On the other hand, histology slides contain many other features with a lower contrast between them. To overcome this issue, we trained our own nuclear model. To do so, we first manually outlined approximately
50 nuclei in an image (Supplemental Fig S2A) and saved the masks and images as a seg.npy file (Supplemental Fig S2B) and repeated this process for 4 additional images. To train our model, we selected “train new model with image + masks in folder” under the *models* tab of the GUI (Supplemental Fig S2B). This procedure resulted in the trained model appearing as an option under the custom models menu, which was then used for segmenting other images (Fig 3A). We uploaded our model for students to use because we recognize training can be challenging for most kindergarten to grade 12 students (40). Finally, the instructor can emphasize the importance of data verification. Students can manually segment a manageable number of cells (e.g., 10 to 50) and compare the results with those from Cellpose. This validation activity not only ensures the quality of data for further analyses but also emphasizes the importance of rigor in biophysics research.

### C. Morphologic analysis

#### 1. Morphologic feature measurements

To perform morphologic quantification, we first opened in ImageJ the same image used to generate segmentations in Cellpose 2.0. We then ran the *imagej_roi_converter.py* macro, which is an ImageJ script automatically included in the Cellpose download, to overlay the segmented boundaries on the image by clicking “Plug-ins $\to$ Macros $\to$ Run.” We then selected the corresponding text file of outline coordinates generated from Cellpose when prompted by the macro (Fig 3B). To obtain the final morphologic features of interest, we selected *Measure* in the ROI Manager (Fig 3D). Specific morphologic features of interest, such as perimeter or area, can be selected by using the *Set Measurements* option. The results (Fig 3E) were then saved as a comma-separated values (CSV) file for further analyses performed in Excel Version 2310 (Microsoft Corporation, Redmond, WA). Although other platforms, such as MATLAB or Python, can be used, we used Excel Version 2310 because it is accessible and intuitive to use while requiring no coding experience.
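
For readers who prefer a scripted route, the same area, perimeter, and centroid measurements can be sketched in Python. This is a minimal illustration and not the workflow used in this work. It assumes that the outline text file stores one object per line as comma-separated x1,y1,x2,y2,... pixel coordinates; this layout should be checked against the file actually produced. All function names are ours.

```python
import math


def parse_outlines(text):
    """Parse outline text, assuming one object per line as x1,y1,x2,y2,..."""
    outlines = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split(",") if v.strip()]
        xs, ys = values[0::2], values[1::2]
        if len(xs) >= 3 and len(xs) == len(ys):  # need at least a triangle
            outlines.append(list(zip(xs, ys)))
    return outlines


def _cross_terms(points):
    """Yield (x1, y1, x2, y2, cross product) for each polygon edge."""
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        yield x1, y1, x2, y2, x1 * y2 - x2 * y1


def polygon_area(points):
    """Enclosed area from the shoelace formula (square pixels)."""
    return abs(sum(c for *_, c in _cross_terms(points))) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths around the closed outline (pixels)."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def polygon_centroid(points):
    """Area-weighted centroid of the polygon."""
    signed_area = sum(c for *_, c in _cross_terms(points)) / 2.0
    cx = sum((x1 + x2) * c for x1, _, x2, _, c in _cross_terms(points))
    cy = sum((y1 + y2) * c for _, y1, _, y2, c in _cross_terms(points))
    return cx / (6.0 * signed_area), cy / (6.0 * signed_area)


if __name__ == "__main__":
    example = "0,0,4,0,4,3,0,3\n"  # a 4 x 3 rectangle as a stand-in outline
    for outline in parse_outlines(example):
        print(polygon_area(outline), polygon_perimeter(outline),
              polygon_centroid(outline))
```

All values are in pixel units and need to be converted with the microscope calibration before comparison across imaging setups. Perimeters traced along pixel edges can also differ slightly from those reported by ImageJ.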

For the junction counting analysis, we examined 108 vertices across 5 images, where a vertex is defined as the point where 3 edges (i.e., cell boundaries) meet. Vertices were manually assessed and categorized as either 3-cell vertices (i.e., isolated tricellular junctions) or associated vertices (i.e., 2 tricellular junctions in close proximity).

#### 2. AR and nuclear-to-cytoplasmic ratio

We first characterized cell morphologic phenotype by calculating the cell AR, a direct readout obtained from ImageJ. Here, the AR is defined as the ratio between the major axis and minor axis, both of which are determined by fitting an ellipse to the outlined area. We calculated the AR for 500 cells across 19 images.
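
The idea behind an ellipse-based AR can be illustrated with a short Python sketch. It assumes that the fitted ellipse is the one sharing the second central moments of the region, which is a common convention; the exact fitting procedure in ImageJ may differ in detail. The function name and the synthetic test shape are ours.

```python
import math


def aspect_ratio(pixels):
    """AR of the ellipse with the same second central moments as the region.

    `pixels` is a list of (x, y) coordinates of all pixels inside the outline.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Eigenvalues of the 2 x 2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    # Variances scale with the squared semi-axes, so AR is the square root
    # of the eigenvalue ratio.
    return math.sqrt((half_trace + root) / (half_trace - root))


if __name__ == "__main__":
    # Synthetic filled ellipse with semi-axes of 40 and 20 pixels.
    ellipse = [(x, y) for x in range(-50, 51) for y in range(-50, 51)
               if (x / 40.0) ** 2 + (y / 20.0) ** 2 <= 1.0]
    print(round(aspect_ratio(ellipse), 2))
```

For the synthetic ellipse, the result is close to the expected value of 2, with a small deviation caused by pixelation.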

To assess the nucleus–cell area correlation, the nucleus and cell measurements from ImageJ were paired by using a MATLAB, version R2023a (The MathWorks, Natick, MA) code (40) that identifies the correspondence between cell and nucleus, based on the centroid coordinates of cell and nucleus outlines. Using this method, 500 cells and nuclei were paired. The paired cellular and nuclear areas were then used for generating a scatter plot and calculating the Pearson correlation coefficient, which is defined as
(3)
$$r=\frac{\sum_{i}\left(A_{c}^{i}-\langle A_{c}\rangle\right)\left(A_{n}^{i}-\langle A_{n}\rangle\right)}{\sqrt{\sum_{i}\left(A_{c}^{i}-\langle A_{c}\rangle\right)^{2}\sum_{i}\left(A_{n}^{i}-\langle A_{n}\rangle\right)^{2}}}$$

Here $A_{c}^{i}$ and $A_{n}^{i}$ correspond to the cell area and nucleus area of the *i*th cell, respectively, and $\langle A_{c}\rangle$ and $\langle A_{n}\rangle$ denote their averages over all cells. The *P* value of correlation was calculated by using Mathematica, version X (Xxxxx, Xxxx, XX). The same calculation can be performed by using other software, such as MATLAB, Prism, and the *scipy.stats.pearsonr* function in Python. Further discussion of the meaning of *P* value is provided in section IV.
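
The pairing and correlation steps can be sketched in Python as follows. This is a simplified illustration and not the MATLAB code referenced above. It assumes that each nucleus belongs to the cell with the nearest centroid and that matches farther apart than a hypothetical cutoff, `max_dist`, should be discarded. The function names and example numbers are ours.

```python
import math


def pair_by_centroid(cells, nuclei, max_dist=20.0):
    """Pair each nucleus with the nearest cell centroid.

    `cells` and `nuclei` are lists of (centroid_x, centroid_y, area).
    Returns a list of (cell_area, nucleus_area) tuples.
    """
    pairs = []
    for nx, ny, nucleus_area in nuclei:
        nearest = min(cells, key=lambda c: math.dist((c[0], c[1]), (nx, ny)))
        if math.dist((nearest[0], nearest[1]), (nx, ny)) <= max_dist:
            pairs.append((nearest[2], nucleus_area))
    return pairs


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                            * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical measurements in pixel units, for illustration only.
    cells = [(10, 10, 400), (50, 10, 600), (90, 10, 500)]
    nuclei = [(11, 9, 27), (49, 12, 40), (91, 10, 33), (200, 200, 30)]
    pairs = pair_by_centroid(cells, nuclei)
    cell_areas, nucleus_areas = zip(*pairs)
    print(len(pairs), pearson_r(cell_areas, nucleus_areas))
```

In this example, the fourth nucleus has no cell within the cutoff and is dropped, which mimics a nucleus whose cell outline was missed during segmentation.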

## IV. RESULTS

Using our imaging-segmentation pipeline (Fig 2A), we were able to reproduce 2 important findings of epithelial cell mechanics. In addition, we demonstrated the role statistics (i.e., sample size) plays in the quantitative analysis of the relationship between nucleus and cell size.

### A. Morphologic characterizations

As discussed in section II, the cell shape and junction network are 2 imperative features to consider when characterizing epithelial morphology, because they help regulate essential tissue properties such as the barrier function and rigidity (46). By analyzing 108 vertices, we observed that all of them included 3 edges. We also found that although most of the examined vertices were well separated (Fig 4A), a few of them were paired (Fig 4B). We summarized this counting result in Figure 4C, which demonstrates that $\sim $ 85% of vertices are isolated vertices (i.e., 3-cell vertices), and $\sim $ 15% of vertices are associated vertices (i.e., 2 vertices in close proximity). Our observation confirms that most cells in the tissue obey the behavior predicted by the Euler polyhedral formula. The observed associated vertices could be related to 4-cell vertices, which have been attributed to active structural rearrangements in living epithelial tissues (47). Events such as cell movements, cell proliferation leading to tissue growth, cell death, or cell extrusion can all contribute to changes in cell packing and organization. Our finding of associated vertices is also consistent with both reported experimental observations (48, 49) and vertex model simulations (47).

To further illustrate how the packing rule governs epithelial cell morphology, we investigated the degree of heterogeneity in our system. To do this, we calculated the probability distribution of cell AR for 500 cells (across 19 images). Here, the AR is defined as the ratio between the major and minor axes of an ellipse fitted to a cell. AR is a key morphologic indicator of epithelial phenotype, in which fully packed cells often exhibit a low AR (Fig 4D), or more rounded shape, as a result of interfacial tension energy minimization. However, because the cells are randomly packed, different cells would experience various degrees of geometric constraint created by their neighbors. As a result, it is anticipated that a small population of cells would still display a relatively high AR (Fig 4E) or elongated shape.

As shown in Figure 4F, we first confirmed that the mean AR of epithelial cells was low ($\langle \text{AR}\rangle \sim 1.45$). We further confirmed that the local packing gives rise to a relatively large spread in AR, indicating a noticeable morphologic heterogeneity in the epithelial monolayer sample (Fig 4F). This again highlights the dynamic nature of epithelial tissues, where active cellular events can introduce variations among the population. To compare the statistics of our results with other experiments and previous simulation predictions, we normalized the AR by rescaling the *x* axis of Figure 4F by using $x=\frac{\text{AR}-1}{\langle \text{AR}\rangle -1}$ (Fig 4G). This rescaling effectively sets the distribution lower bound and mean to be 0 and 1, respectively. We then fitted our data to a gamma distribution, denoted by the black curve shown in Figure 4G. Here, a gamma distribution is defined as
(4)
$$\text{PDF}(x;\kappa )={\kappa}^{\kappa}{x}^{\kappa -1}{e}^{-\kappa x}/\Gamma (\kappa )$$
where PDF is probability density function, $\Gamma (\kappa )$ is the Legendre gamma function, *x* is the variable, and $\kappa$ is the shape parameter. This function is often used to describe a distribution that has a sharp lower bound of zero but no upper bound, and a positive skew. Such a distribution has been used to model numerous social and biologic examples, including the size of insurance claims, the age of cancer incidence, and the interspike interval between neuron firings (50–52). From our fitting, we determined the shape parameter $\kappa =2.40$, which is similar to the measurements using dog kidney cells ($\kappa =2.39$), *Drosophila* ($\kappa =2.52$), and the vertex model prediction ($\kappa =2.53$) (53).
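
The rescaling and the gamma distribution of Eq. 4 can be explored with the short Python sketch below. It is not the fitting procedure used in this work. As a simple stand-in, the shape parameter is estimated by the method of moments, using the fact that the unit-mean gamma distribution of Eq. 4 has a variance of $1/\kappa$. The function names and example AR values are ours.

```python
import math
from statistics import fmean, pvariance


def rescale_ar(aspect_ratios):
    """Rescale AR so that the lower bound maps to 0 and the mean maps to 1."""
    mean_ar = fmean(aspect_ratios)
    return [(ar - 1.0) / (mean_ar - 1.0) for ar in aspect_ratios]


def gamma_pdf(x, kappa):
    """Unit-mean gamma probability density function of Eq. 4."""
    return (kappa ** kappa * x ** (kappa - 1.0) * math.exp(-kappa * x)
            / math.gamma(kappa))


def estimate_kappa(rescaled):
    """Method-of-moments estimate: the variance of Eq. 4 equals 1 / kappa."""
    return 1.0 / pvariance(rescaled)


if __name__ == "__main__":
    # Hypothetical AR values, for illustration only.
    ars = [1.2, 1.4, 1.6, 1.8]
    x = rescale_ar(ars)
    print(x, estimate_kappa(x))
    print(gamma_pdf(1.0, 2.4))
```

A least-squares or maximum-likelihood fit to the measured histogram may give a somewhat different value of $\kappa$ than this moment estimate, so the two should be compared rather than assumed to agree.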

Finally, we calculated the shape index $q={P}_{0}/\sqrt{{A}_{0}}$ of each cell and found that the mean value, $\sim 4.21$, is higher than the fluid-to-solid transition threshold of 3.81 predicted by the vertex model. This discrepancy could be due to dynamic tissue rearrangement resulting from extrusion or apoptotic events. Specifically, as dying cells are shed off and new cells enter the stratum corneum, some cellular rearrangements could still occur and contribute to a larger measured shape index. We should, therefore, be reminded of the dynamic nature of a living tissue and the importance of considering other biologic factors when interpreting measurements of physical properties. Alternatively, because our analyzed frog skin epithelial cells are fully differentiated, the tissue should be completely solidlike, and the deviation in shape index may instead be due to the tortuous perimeters of the cells we examined. A tortuous perimeter effectively leads to a higher shape index value, which was not considered in the original vertex model simulation. The origin of ruffled junctions (i.e., zigzag shape) and the biologic implications remain active research topics (54), which could be an interesting postlab discussion topic. Overall, we conclude that our morphologic analyses are in agreement with previous experiments and simulations.

### B. Nuclear-to-cytoplasmic area ratio

In this section, we analyzed the correlation between the nucleus and cell area. The ratio between these areas has been shown to be roughly a constant within a tissue sample but can vary across cell types. In testing the cell–nucleus correlation, we also demonstrated how statistics impact the analysis results and robustness, which play an essential role in interpreting research observations. We first plotted the nucleus area as a function of cell area by using all of the data points ($N=500$, shown in the left panel of Fig 5A). In this plot, we observed a clear positive correlation between the cell area and nucleus area. The observed correlation was confirmed by a moderate Pearson correlation coefficient $r\sim 0.67$. The proportionality between the cell and nucleus areas was further confirmed by a linear fit (red line), ${A}_{n}=0.067{A}_{c}-1.883$, whose intercept is small relative to the mean nucleus area, $1.883/\langle {A}_{n}\rangle \ll 1$.

To observe how the statistics influence the correlation analysis, we systematically reduced the data points by sampling a subset of $N=100$, 50, and 10 points. As shown in Figure 5A, we found that the nuclear-to-cytoplasmic area ratio is preserved across our tested sample sizes. However, we also found that the data trend starts to vary when the sample size is less than 50, as indicated by the changing slope of fitted lines (red dashed lines). To more quantitatively characterize this finding, we calculated the Pearson correlation coefficient between cell and nucleus area for all tested sample sizes and repeated this calculation for 100 randomly sampled data subsets (Fig 5B). Consistent with our observation, although the mean value (Fig 5B, dark blue middle line) of the Pearson correlation coefficient does not depend on sample size, its standard deviation (Fig 5B, light blue band) does. In other words, the Pearson correlation coefficient remains unchanged when sampling only a few cells, although determining a trend becomes less definitive when the sample size is less than $N=20$.

Finally, we analyzed how the correlation’s *P* value varies in response to sample size (Fig 5C). Here, the *P* value is the probability of observing a correlation at least as strong as the one measured when, in fact, there is no underlying correlation (i.e., the null hypothesis is true). This statistical analysis is important to report for any biologic experiment, because such experiments often contain notable uncertainties and variations. Typically, a *P* value less than 0.05 (yellow-shaded region in Fig 5C) is accepted as significant, suggesting that the null hypothesis can be rejected. Our results showed that the observed correlation becomes statistically significant at a sample size of around *N* = 20. This is consistent with the results in Figure 5B, in which we saw a high degree of reproducibility in the Pearson correlation coefficient beginning around *N* = 20. Together, these results suggest that students should analyze at least 20 cells in future experiments to obtain statistically significant data.

In our error analysis activity, the 500 data points were collected by using images from 3 histology samples and thus presumably represent 3 biologic replicates. Biologic replicates are critical for quantification and allow statistical analyses of the data. A potential caveat with purchased histologic slides is that one cannot guarantee that the preparations come from different individual animals. Therefore, although the activity was designed to demonstrate the importance of sample size, instructors should discuss how biologic replicates are typically achieved in laboratories, the differences between technical and biologic variations, and the use of biologic replicates for assessing statistical significance.

## V. DISCUSSION

One important implication of our demonstration is the morphologic contrast between epithelial and mesenchymal cells. Epithelial cells express a number of adhesion molecules and junctional complexes, including cadherins, tight junctions, and desmosomes, which function collectively to enable strong cell–cell adhesion. These adhesion molecules are also linked to actin cables intracellularly and allow cells to both transmit forces and respond to mechanical signals. Together, these molecules underlie the intercellular junctional tension and contribute to the prototypical epithelial cell shapes, as students would have observed in the frog skin. Instructors can prompt students to think about what cells would look like if they did not have these adhesion molecules. Will packed cells be more rounded and less like cobblestones? Will cells still be able to move around freely in a crowded environment? Indeed, mesenchymal cells are distinct from epithelial cells in that they have reduced cell–cell adhesion and elevated cell–matrix interaction. As such, packed mesenchymal cells often do not exhibit the same cobblestone morphology as epithelial cells and are more motile. Nonetheless, their morphologic features can be quantified by using the same method described here and investigated in the context of tissue mechanics.

Because our experimental module has 3 main components, we envision that it would be best presented as a lab demonstration. For middle and high school students with roughly 50-min science class periods, the lab can be broken up into a 3-day experiment, in which each day corresponds to one module presented in Figure 2A. For college students, the lab can be incorporated into a lab curriculum or into lecture-based classes. In the latter case, students can acquire the data during the lecture’s discussion section and complete the segmentation and analysis as part of the lab write-up. If using this module as an outreach demonstration, we recommend giving all students the opportunity to work on the 3-module experiment collaboratively.

We anticipate that advanced high school students should be able to perform Cellpose model training. For elementary and middle school students, we recommend using only the pretrained models, and only if the instructor feels comfortable exploring the use of AI-based segmentation with advanced students (Table 1). If students have difficulty identifying nuclear outlines with the pretrained models, we advise them to use our trained model, which is available for download (40).

We anticipate that all students, from elementary to high school, should be able to conduct the junction vertex analysis, which mainly requires counting skills. For ease of visualization and to avoid double counting, elementary students can highlight the edges contributing to vertex formation. To further simplify the task or save time for elementary school students, instructors can color code 3-cell vertices (isolated) versus associated vertices beforehand and ask students to count the number of vertices in each color.

## VI. CONCLUSION

In this work, we present a scalable teaching module exploring basic experimental and analysis skills in tissue mechanics research, including microscopy, image segmentation, and statistics. Using commercial histology slides of frog skin epithelial cell samples, we demonstrated how to characterize the junction vertices and morphology of jammed epithelial cells. By calculating the correlation between the cell and nucleus areas, we also showed the role that sample size plays in quantitative analysis. These cell properties are important to study in epithelial tissues, as they are known to be associated with cell–cell and cell–substrate interactions and are often used to report on the state of cells. All the images and computational routines used throughout this work are publicly available. We envision that the learning activities introduced here will be useful for both classroom teaching and outreach programs.