The term substantial evidence, coined by the US Food and Drug Administration (FDA) as part of the Federal Food, Drug and Cosmetic Act, applies mostly to new drug development, but also to products that are already approved. Although it is practically applied to drugs, it is important for the personal care industry to understand its rationale, especially since cosmetic products are becoming more sophisticated. Moreover, the same approach might be used when a claim issue regarding a personal care product arises and the FDA must make a judgment.
When it issues warning letters that request substantial evidence from manufacturers, the FDA is asking for evidence to support all the advertising claims made, including but not limited to comparative claims. In using this term, the agency is not referring to the amount of evidence generated, nor to the quantity of persuasive, competent or reliable evidence. Instead, it uses the term in its technical sense, i.e., supportive data that will lead to significant scientific agreement.1 The Federal Food, Drug and Cosmetic (FD&C) Act defines substantial evidence as “evidence consisting of adequate and well-controlled investigations, including clinical investigations, by experts qualified by scientific training and experience to evaluate the effectiveness of the drug involved, on the basis of which it could fairly and responsibly be concluded by such experts ... .”
This definition, as noted, is applied to drugs. For cosmetics, the manufacturer is not required to generate or submit information to the FDA regarding the biological effect of the product; therefore, there are no clear guidelines for the substantiation of claims made for cosmetics. In fact, cosmetic products, by definition, should not generate any biological effects. If safety issues arise with a cosmetic product, the FDA can order the cessation of distribution of a misbranded or adulterated cosmetic product with a court order, or require companies to place warning labels on their products. However, due to the lack of FDA-filed safety information, and without clear evidence of acute injury, the FDA is not able to enforce warning labels or issue court orders.
As a service to manufacturers and to the general public, the FDA issued the following cautionary statement on its Web site on April 25, 2006: “Be aware that promoting a product with claims that it treats or prevents disease or otherwise affects the structure or any function of the body may cause the product to be considered a drug. The FDA has an Import Alert in effect for cosmetics labeled with drug claims.”2
Defining Structural Change
Considering the FDA’s warning against cosmetics marketed as affecting the structure of the body, it is important to consider: When does the skin undergo structural changes? How does one define changes in skin structure when the skin, with its viable and nonviable layers, is by nature a constantly changing organ? Not only does it exfoliate and lose a layer of corneocytes each day, it is also changing with hormonal cycles, seasons, exposure to daily insult, humidity, temperature and radiation, and age. It is an organ designed to undergo structural changes in order to protect itself and the body it shields.
One could argue, for example, that when soaked in water, the corneocytes in the skin swell and the barrier changes its structure reversibly and temporarily. Studies have shown that daily changes in environmental humidity accelerate epidermal differentiation, barrier homeostasis, epidermal proliferation and inflammatory response.3 Occlusion alone has been demonstrated to affect skin residential flora, moisture and pH.4 Where does one draw the line between structural changes that result normally or from application of formulations to skin, and when do these structural changes become significant enough so as to require FDA approval as drugs? What’s more, during the past two decades, the cosmetic industry has launched a variety of formulas referred to as cosmeceuticals, a category not recognized by the FDA, which are cosmetics intended to impart various functions. Compounds such as glycolic acid, kojic acid and others have been shown to affect skin cell structure, rate of exfoliation, color and tone. Clearly, these are structural changes.
According to the FD&C Act, a cosmetic product is intended to cleanse, beautify and promote attractiveness or alter appearance. When a product also claims to treat or prevent disease, or to affect the structure or any function of the human body, it is considered a drug. Hence, does the fact that cosmeceuticals affect the skin structure and function, yet are not declared or claimed as such, mean that the industry is complying with the regulations? Is the industry misleading consumers or fulfilling their needs? And are these needs physical or psychological?
The Function Premise
While the industry may be clear on safety studies, there are no guidelines as to efficacy, since cosmetic products are not supposed to exhibit efficacy. Therefore, the marketing need and consumer demand for the product to “act” lead to confusion and a lack of consistency among manufacturers. The question is: How does one prove the efficacy of a cosmetic product if no standardized, reviewed studies are required? The lack of federal guidelines leaves room for companies to conduct studies that are not only inadequately designed, but may also be weakly analyzed and translated into claims.
Such poorly translated claims often mislead consumers to believe that a product can act in a way that it does not. For instance, the current competitive market is flooded with products that claim to change the appearance of skin in such a way that consumers may interpret these claims as changes that occur beyond the skin surface. And with the emergence of products that do alter skin appearance, perhaps beyond the surface, consumers expect great things to happen when they invest in such a pledge.
The problem is both ethical and economic. In terms of economics, developing new cosmetic products requires a significant investment, so if a product does not deliver on its promise, consumers will purchase it only once, shortening its life in the market.
The purpose of this paper is not to settle these issues, nor is it to investigate or take a stand; it is intended simply to address key aspects of studies conducted by cosmetics manufacturers and to delve into a few of the inaccuracies that occur when translating the data into claims. Regardless of regulations, this author believes that the foundation of the cosmetics and personal care industry should be the production of safe products that are developed using scientific knowledge, as the FD&C Act describes for drugs, and marketed ethically.
In vitro Studies
The generation of in vitro data is the basis of any biological and clinical research. Understanding the effect of a substance on cellular cascades in cell culture or on ex vivo tissue is essential both for safety assessment and for understanding the mechanism of action for a substance. However, the data generated from in vitro studies cannot be translated into claims that point to organ-related or systemic activity in vivo. A few of the key reasons are as follows.
Isolated vs. systemic enzyme behavior: Enzymes can behave differently when isolated than within their natural environment in the body. In nature, enzymes tend to be present in clusters, and more than one enzyme is found to be responsible for a chain reaction that involves multiple steps. When enzymes are removed from their natural environment in the body, it is not only their stability that may be compromised, but also the reactions they catalyze, since these may not be represented in the in vitro conditions. In addition, enzymes can be compartmentalized and embedded into an organelle that may limit their activity or duration, or control their chemical or physical orientation. When isolated, the enzyme is free to act in its entirety, a condition that may differ from in vivo. Moreover, many of the enzymes used are not derived from human sources. For example, tyrosinase, the enzyme often used to follow melanin generation from its substrate, is typically derived from mushrooms.5
Testing in cell cultures: When testing the effect of a compound in cell culture, a few factors should be considered regarding the translation of data into claims. For instance, when a substance is applied to the skin, it partitions into the skin’s different layers and may or may not penetrate through them. An intact, healthy skin serves as an excellent barrier and repels the penetration of most compounds. If the culture used consists of living cells that reside in the epidermis or dermis, one assumes that the compound tested has reached the layers where the cells reside. Therefore, evidence should be generated to prove that the compound penetrates the skin to interact with those cells. If not, the data generated in culture may not be valid.
The skin is a complex organ in which each layer differs from the other. When assessing its nature from the uppermost layer, the stratum corneum (SC), there are a few gradients deeper into the dermis to consider. These include: the water gradient, 10–20% in the SC and up to 70% near circulatory entities such as the capillaries; pH gradient, which is acidic at the SC and neutral at the circulation; and lipid gradient, with ceramides at the SC and phospholipids at the epidermis and dermis. Cell culture media are water-based, and therefore, the compound solubility and its proximity to the cell membrane in the cell culture are different in nature from the interaction with cells that are part of the whole skin as an organ.
Also, it is important to consider that in healthy skin, approximately one layer of corneocytes is lost every day, and normal skin cell turnover takes around 30 days. This process dilutes the test compound that would eventually penetrate the skin and reach the cells. Human skin is also constantly exposed to external insults; some, such as radiation, may penetrate the epidermis and dermis. In culture, however, there is no exposure to such conditions unless the study protocol specifically requires it. Radiation is an example of an insult that can change the chemistry and physics of compounds applied to skin. Some compounds, when exposed to radiation, change their chemistry and become toxic; these are called phototoxic compounds. Examples are found among sunscreens, essential oils and a few classes of antibiotics such as tetracyclines.6
Finally, the skin is a metabolic organ. Although its “metabolic engine”, called the P450 system, and other enzymes are significantly less efficient than those in the liver, in some cases, the compound that is penetrating the skin may undergo metabolism. The metabolite can be as active as the original, less active, inactive or toxic. This metabolism is less likely to occur in culture.
Ex vivo testing: Many of the differences between cell culture studies and in vivo studies also apply to ex vivo and in vitro studies. However, the advantage of working with a whole tissue is that the partitioning and distribution of the compound into the skin is, to a large extent, comparable. Here, as well, several important issues should be considered, such as frozen cadaver skin vs. fresh skin. It has been demonstrated that frozen cadaver skin loses its ability to metabolize most compounds, and the only class of enzymes shown to maintain activity were the esterases.7
Considering dermatomed skin, when studying skin penetration in vitro, only the epidermis is used.8 This tissue provides the rate-limiting step for penetration and therefore, if a compound is penetrating beyond it, it is believed to partition into the circulation. When working with skin samples, the dermis is removed.
Metabolism, again, is another consideration. Although researchers know that the skin is a metabolic organ, the importance of metabolism for compounds that interact with it has yet to be explored. The scientific community has already learned to make no assumptions when it comes to the complexity of this organ. For example, the SC was once considered a “dead” sub-tissue that provides only a physical shield. Today it is believed not only to release enzymes from its flora that are nourished by sweat and sebum, but also, in its base, to release enzymes that are responsible for exfoliation through breaking the desmosomal junctions that connect cells.9 When the skin is removed from the body, as explained, its capability to metabolize is diminished significantly.
With diffusion cells, the diffusion driving force in the receiver should also be considered when ex vivo skin is used for testing. The model for diffusion cells to examine skin penetration involves the use of a receiver compartment that mimics blood circulation. This study is based on the assumption that a compound will pass through a membrane such as the skin from a high concentration to the sink, or reservoir in the receiver compartment, following Fick’s law for simple diffusion.10 Therefore, the penetration profile will be highly affected by the content of the receiver compartment.
The higher the solubility of the tested compound in the receiver content, the higher the penetration rate. Therefore, the rate of penetration tested in this model can be artificially modified by changing the solvent in the receiver compartment to one that either allows for better solubilization of the active or retards it. Finally, exfoliation should be a key consideration when working ex vivo. When removed from the body, frozen cadaver skin or fresh skin lacks natural blood nourishment and exfoliation. This means the substances applied to skin will remain there because they are not drained by the circulation or removed by natural exfoliation.
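The dependence of the measured penetration rate on the receiver compartment can be illustrated with a minimal sketch of Fick's first law at steady state. The function name, the units and all numerical values below are hypothetical, chosen only to show the trend; the sketch also assumes a homogeneous membrane and a single partition coefficient for both faces, which real skin does not satisfy.

```python
def steady_state_flux(diffusion_coeff, partition_coeff, thickness,
                      donor_conc, receiver_conc=0.0):
    """Steady-state flux from Fick's first law:
    J = (D * K / h) * (C_donor - C_receiver).

    Illustrative units: D in cm^2/h, h in cm, concentrations in mg/cm^3,
    which gives J in mg/cm^2/h. One partition coefficient K is assumed
    for both faces of the membrane (a simplification).
    """
    if thickness <= 0:
        raise ValueError("membrane thickness must be positive")
    permeability = diffusion_coeff * partition_coeff / thickness  # Kp, cm/h
    return permeability * (donor_conc - receiver_conc)


# Hypothetical values, not measured data.
D = 1e-6      # diffusion coefficient, cm^2/h
K = 10.0      # membrane/vehicle partition coefficient
h = 0.002     # membrane thickness, cm
donor = 10.0  # donor concentration, mg/cm^3

# Receiver behaves as a sink: the compound is carried away as it arrives.
sink_flux = steady_state_flux(D, K, h, donor)

# Receiver in which the compound has accumulated to half the donor level,
# as could happen when it dissolves poorly in the receiver fluid.
backed_up_flux = steady_state_flux(D, K, h, donor, receiver_conc=5.0)
```

Under these assumptions the flux falls in proportion to the shrinking concentration difference, which is one way to see why the choice of receiver solvent can raise or lower the apparent penetration rate.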
Stability is a relative term. A compound may be stable under certain conditions yet unstable under others. The active compound’s stability can change when exposed to different environments, including: the type of formulation, pH, temperature of production, interaction with other ingredients, and mode of application, among others. Unless the stability of the active compound is tested and confirmed in the final formulation, it is difficult to estimate whether the compound has maintained its original form over time in the formulation.
In this sense, a study to substantiate claims should be conducted with the finished formulation, preferably in its finished packaging since the packaging can influence the content of the formulations; for instance, some compounds can absorb or interact with plastics. In addition, the packaging can affect the amount of cream or lotion released for application to the skin.
Site of Activity
In drug development, when a compound is found to act on an unanticipated site, it may generate unpredicted adverse effects. Therefore, any compound that exhibits biochemical activity should have a recognized, defined site or sites of action at the tissue and cellular levels, and the amount of compound required to generate an effect should be determined. For example, if the claim is made that an active substance inhibits collagenase activity, the following considerations should be tested during its development:
- Skin penetration to dermis layer, where this enzyme is released by fibroblasts;
- The specific location of the enzyme, in the cell or in the tissue;
- The amount of compound required to generate a significant effect;
- The duration of activity;
- The compound’s distribution, metabolism and elimination from the site of action; and
- The compound’s effectiveness vs. its toxic potential (the therapeutic index).
Clinical Study Design
Regarding clinical studies, the study panel should be large enough to support a meaningful statistical analysis. In addition, the study should be designed to test a hypothesis and to monitor and assess the events after panelists are exposed to the formulation. When choosing a panel size, the following should be considered: type of measurements, number of products tested, site of testing, and variations within the group of panelists such as age, gender and ethnicity.
There are two types of clinical studies, observational and randomized controlled. The aim of a randomized study is to provide the most compelling evidence that the study treatment causes the expected effect on human health. In an observational study, the investigator merely observes correlations between treatment and status of the effect. Statistical evaluations should demonstrate significant differences in the outcomes among the different groups tested. The number of panelists required to provide statistically significant results will depend on the question that the trial intends to answer. The number of panelists enrolled will have a large bearing on the ability of the study to reliably detect the size of the effect of the study intervention. This is described as the power of the study. The greater the number of participants in the trial, the greater the statistical power.
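The relationship between panel size and power described above can be sketched with the common normal-approximation formula for comparing the means of two equal-sized groups. This is an illustration under stated assumptions, not a substitute for a statistician's design: the function names are hypothetical, the effect size is the expected difference divided by the standard deviation, and the approximation ignores the small-sample correction a t-test would need.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def panelists_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate panelists needed in each of two groups (two-sided test).

    n = 2 * ((z_{1 - alpha/2} + z_{power}) / effect_size) ** 2
    """
    if effect_size <= 0:
        raise ValueError("effect size must be positive")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approximate_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of the same comparison for a given group size."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


medium_effect_n = panelists_per_group(0.5)   # moderate expected difference
large_effect_n = panelists_per_group(1.0)    # large expected difference
small_panel_power = approximate_power(10, 0.5)
```

With these assumptions, a moderate effect (0.5 standard deviations) calls for about 63 panelists per group at 80% power, while a panel of 10 per group has only about a one-in-five chance of detecting the same effect.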
Pilot-sized studies typically involve from 2 to 10 subjects and are conducted to gain insight for the design of a clinical trial to follow. The results from pilot studies, if published, should state their limitations clearly, and the conclusion should include a caveat advising that additional studies are required before any claim is fully substantiated.
Subjective vs. Quantitative Analysis
Many of the studies conducted in cosmetics R&D require subjective evaluation by a trained expert. This evaluation is mostly determined visually and is largely based on the experience of the individual who grades the results. One example is the human repeat insult patch test (HRIPT), which is conducted routinely to determine the potential of a formula to generate adverse reactions such as primary irritation or skin sensitization (allergy). In this type of study, reactions are typically scored from 0 (no reaction) to 4 or 5 (a severe reaction). As noted, the evaluation is conducted visually, and the expert grading the reaction compares the skin’s appearance before and after exposure to the test formulation. The parameters considered are: color (redness) and morphology (skin changes that appear when the skin is compromised).
Clearly, the expert’s knowledge and experience are crucial for this type of subjective evaluation. Therefore, the panel size should be relatively large to assure accuracy. In addition, one should consider adding an objective quantitative evaluation to the study. In the case of HRIPT, the following measurements can be added:
- Skin redness, measured by chromameter as the a* value;
- Transepidermal water loss (TEWL), which can be indicative of a reduction in barrier integrity;
- pH, which can change when the skin is compromised; and
- Skin temperature, which can be elevated when the skin is inflamed.
Consumer Feedback and Questionnaires
Questionnaires are employed to gather opinions and learn about panelists’ behavior when using a product. However, when designing or analyzing questionnaires, researchers should consider a few parameters. First, the choice of panel is important, and if the same panel is to be used for quantitative measurements, the correlation between panel response and the instrumental evaluation must be examined. Also, the nature of the questions and their semantics are crucial; the study’s reproducibility and general applicability should be contemplated. As an aside, it has also been shown that in many cases panelists feel forced into filling out a questionnaire, leading to “questionnaire fatigue,” which results in unusable data.11
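One way to examine the correlation between panel response and instrumental evaluation is a rank correlation, which suits ordinal questionnaire scores. The sketch below computes Spearman's coefficient with the standard library only; the function names and the sample data are hypothetical and serve purely as an illustration, and the choice of Spearman over other measures is an assumption rather than a recommendation from the cited literature.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length, at least 3 items")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: panelists' dryness scores and an instrument reading
# taken on the same panelists.
panel_scores = [1, 2, 2, 3, 4, 5]
instrument_readings = [8.1, 9.0, 9.4, 11.2, 12.5, 15.0]
agreement = spearman(panel_scores, instrument_readings)
```

A coefficient near 1 or -1 would suggest that the panel's perception tracks the instrument, while a value near 0 would call into question whether the questionnaire and the measurement describe the same effect.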
Professionally designed questionnaires should be developed with measurement scales by field experts, and statistical evaluation should be employed once results are obtained. Questionnaire language should be considered carefully so as not to be misleading or diverting. Individual words that are dropped into a sentence in different sequences can provide different contexts. Linguists have discovered that individuals’ views of the world and accumulated knowledge may be captured in their language. This presents a challenge to the questionnaire developers, as they must ensure that the questionnaire is appropriate to the language and the culture of its participants.
When the questionnaire is intended to evaluate the results from the use of personal care products such as creams or lotions and asks the panelist to evaluate measurable quantitative improvements such as a reduction in the appearance of lines and wrinkles, it is important to note that other subjective considerations may come into play. These may include the scent of the product, its packaging, the tactile feel and the time to absorb into the skin. Since these factors are a part of the individual’s subjective experience, they cannot be captured entirely, although an overall opinion from the subject could provide additional information.
Activity of Raw Materials vs. Finished Formulas
A study conducted to substantiate a claim for an individual active compound cannot be used to substantiate a claim for a different formulation that contains that compound in combination with other active ingredients. This is because the combination can counteract its activity, or impart additive or synergistic activities, which can sometimes lead to safety issues. Moreover, the vehicle itself can alter the performance of the active. Changing properties such as the pH, solubility and viscosity can alter not only the characteristics of the active, but also its partitioning and distribution in the formulation, on the skin and into the skin. The vehicle can affect the resistance of the SC to penetration and can largely modify the clinical effect by either enhancing penetration, retarding or preventing it, or prolonging or delaying it.
The initiative to write this paper arose from concern for the direction that the cosmetics and personal care industry has taken in the past few decades. The industry has become more sophisticated in many ways—in science, marketing and public relations, regulations, and most importantly, in safety assessment. However, sophistication does not always accompany wisdom, or ethics or even validity. Most individuals will periodically ask important questions that are related to profitability, globalization and the gain of power and presence. These are essential because the cosmetics and personal care industry addresses an important need in society and holds weight in the overall economy. However, at times, there is a need to pause and ask equally important questions about the nature of the business we are in.
Is the industry serving the public’s best interest in the best manner? Is it producing safe products that deliver on their promise? Where and how can the industry improve?
This paper merely raises a few points and provides examples to consider when developing and marketing a personal care formula. In real life, no one situation resembles another. Each product development process has its own scope and limitations. The industry should strive to utilize the tools available to improve evaluation and product development.
Send e-mail to CT_Author@allured.com.
1. A Friede, Recent warning letters for ads reflect FDA’s fixation on “substantial evidence,” Washington Legal Foundation 22(31) 1–4 (2007)
2. Cosmetic Labeling and Label Claims overview: What about therapeutic claims?, FDA Web site, available at www.fda.gov/Cosmetics/CosmeticLabelingLabelClaims/default.htm (accessed Jan 27, 2010)
3. M Denda, Skin barrier function as self organizing system, Review Forma 15 227–232 (2000)
4. AA Hartmann, Effect of occlusion on residential flora, skin moisture and skin pH, Arch Dermatol Res 275 251–254 (1983)
5. I Kubo, QX Chen, KI Nihei, J Calderon and CL Cespedes, Tyrosinase inhibition kinetic of anisic acid, Z Naturforsch 58 713–718 (2003)
6. RK Hans, N Agrawal, K Verma, RB Misra and RS Ray, Assessment of the phototoxic potential of cosmetic products, Food and Chemical Toxicology 46(5) 1653–1658 (2008)
7. P Savapichayont, In vitro viable skin model development to assess cutaneous delivery and metabolism of ester type compounds, doctoral thesis, College of Pharmacy, University of Saskatchewan (Apr 2000)
8. OECD Guidelines for dermal in vitro absorption testing #428, Organisation for Economic Co-operation and Development (2004), available at http://ec.europa.eu/food/plant/protection/evaluation/guidance/wrkdoc20_rev_en.pdf (accessed Jan 27, 2010)
9. N Dayan, The stratum corneum—Dead or alive, NY SCC newsletter, Cosmetiscope 10 1,6 (2004)
10. N Dayan, Dermal absorption guidelines for cosmetic ingredients; An in vitro method, Cosm & Toil 58–64 (Mar 2009)
11. KR Larsen, D Nevo and E Rich, Exploring the semantic of questionnaire scales, proceedings of the Hawaii International Conference on System Sciences, IEEE 1–10 (2008)