
J. Soc. Cosmet. Chem., 40, 297-306 (September/October 1989)

The flex wash test: A method for evaluating the mildness of personal washing products

DARCEE DUKE STRUBE, STEPHEN W. KOONTZ, RICHARD I. MURAHATA, and RICHARD F. THEILER, Unilever Research United States, Inc., 45 River Road, Edgewater, NJ 07020.

Received November 1, 1988.

Synopsis

Various clinical procedures exist for determining the mildness of personal washing products. It is common to use several of these evaluation methods in the development of a safety-and-claim support package. The utility of many of the methods is limited by their susceptibility to fluctuations in weather conditions. In this paper we describe a method, the flex wash test, which is not affected by changes in weather and can be used as a highly reproducible method for determining the relative irritancy potential of personal washing products. The flex wash test consists of a sixty-second wash, three times daily, of the antecubital fossa (flex area) of the arm. Washing is conducted for five consecutive days or until a moderate erythemic response is elicited. Erythema is assessed prior to each wash and four hours after the last daily wash. Twelve commercially available personal washing bars were evaluated in this study. The flex wash is a reproducible clinical test that distinguishes differences in the relative irritancy potential of various syndet (synthetic detergent) and soap bars and is independent of ambient weather conditions.

INTRODUCTION

A variety of test procedures exist for determining the relative mildness of personal cleansing products on human skin (1-4). The overall categories for the methods include patch testing, exaggerated use tests, and normal use tests. Normal use tests with soap bars have been conducted (1), but they require large panel sizes in order to differentiate between varying levels of performance. This can become very expensive and time-consuming for routine screening of marketed products and new formulations. Patch test methods like the soap chamber test (2) can be useful in distinguishing differences in relative irritancy potential. However, the authors report that surfactants can respond differently under occluded versus normal use conditions. Arm immersion (3) and half-face (4) tests more closely resemble realistic use conditions but are weather-dependent. Due to weather dependency factors that affect most of these tests, clinical trials with these methods are most sensitive during the cold, dry winter months.
The data presented in this paper demonstrate the reproducibility and utility of the flex wash test to accurately discriminate the relative irritancy potential of personal washing products regardless of the ambient weather conditions. The studies reported here were conducted in the New York-New Jersey area.

MATERIALS AND METHODS

Fourteen commercially available personal washing bars (designated bars A-N) were tested using the flex wash test. Compositional ingredients of these bars as obtained from the label are provided in Table I. Bar A was included as a reference since studies reported by others (2) demonstrated the mildness of this bar.

The subjects were male and female volunteers between 20 and 55 years old. All subjects were in general good health with no history of dermatologic conditions. Informed consent was obtained prior to the initiation of the test. The antecubital fossae (flex area) of the arms were free of cuts and abrasions, with no irritation present at the onset of the study. Twelve to twenty subjects were randomly selected from a group of 200 for each direct comparison of two bars. The test group was then divided into two subgroups, which were balanced for hand dominance. Group I used product 1 on the left arm and product 2 on the right, while group II used the same two products but on opposite arms.
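The subgroup assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not say how balancing for hand dominance was carried out, so the stratified shuffle below, the function names `assign_groups` and `arm_products`, and the fixed seed are all hypothetical.

```python
import random

def assign_groups(subjects, seed=0):
    """Split a panel into two subgroups balanced for hand dominance.

    `subjects` is a list of (subject_id, dominant_hand) pairs with
    dominant_hand equal to "L" or "R". The stratified shuffle is an
    assumed procedure, not one stated in the paper.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    groups = {"I": [], "II": []}
    for hand in ("L", "R"):
        stratum = [sid for sid, h in subjects if h == hand]
        rng.shuffle(stratum)
        # Start with whichever group is currently smaller, then alternate,
        # so each group gets a near-equal share of each handedness.
        if len(groups["I"]) <= len(groups["II"]):
            order = ("I", "II")
        else:
            order = ("II", "I")
        for i, sid in enumerate(stratum):
            groups[order[i % 2]].append(sid)
    return groups

def arm_products(group):
    """Group I: product 1 on the left arm; Group II: arms reversed."""
    if group == "I":
        return {"left": "product 1", "right": "product 2"}
    return {"left": "product 2", "right": "product 1"}
```

Reversing the arms between the two subgroups means that any left/right or dominant-hand effect falls equally on both products.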

All wash treatments were conducted by the subject under supervision in the laboratory. The sponge (JAECE Identi-Plugs, size D foam test tube plug) and the cleansing bar were moistened with tap water (approximately 100 ppm total hardness, maintained at 92 ± 4°F) immediately before use. The sponge was stroked over one of the test bars ten times by the study monitor and placed in the subject's right hand. The left flex area was moistened with tap water and gently washed for 60 seconds. The washing procedure was an elliptical motion with 120 strokes per minute. The flex area was then rinsed for approximately 10 seconds under running tap water until all lather was removed and then patted dry with a soft disposable towel. The procedure was repeated on the right arm using the left hand with the other product being tested. The test sites were treated three times daily with 1.5 hours between washings for five consecutive days.

Table I
Composition of Commercial Personal Washing Bars

Bar code   Predominant ingredients
A          sodium cocoyl isethionate, stearic acid, sodium tallowate, water, sodium isethionate, coconut acid, sodium stearate
B          sodium tallowate, potassium soap, water
C          triethanolamine soap, sodium tallowate, glycerin
D          sodium tallowate, sodium cocoate, water, coconut acid, sodium polyacrylate, glycerin, cocoa butter
E          sodium cocoate, sodium tallowate, water, glycerin, coconut acid
F          sodium tallowate, sodium cocoate, glycerin, water, coconut acid
G          sodium tallowate, sodium cocoate, water, PEG-6 methyl ether, triclocarban, glycerin
H          dextrin, sodium lauryl sulfoacetate, water, boric acid, urea, sorbitol, mineral oil, PEG-14M
I          sodium tallowate, sodium cocoate, water, petrolatum, glycerin
J          sodium tallowate, sodium cocoate, water, mineral oil, PEG-75, glycerin, lanolin oil
K          sodium tallowate, sodium cocoate, water, vegetable oil
L          sodium tallowate, sodium cocoate, water, sodium cocoglyceryl ether sulfonate, glycerin, coconut acid
M          sodium tallowate, sodium cocoate, sodium cocoglyceryl ether sulfonate, glycerin, coconut or palm kernel acid, triclocarban, polyquaternium-7
N          sodium tallowate, sodium cocoyl isethionate, water, sodium cocoate, stearic acid, triclosan
The test sites were evaluated by a trained examiner for irritation immediately prior to each wash and four hours after the third daily wash, for a total of 20 evaluations. Sites were graded using a seven-point scoring system (0-3). Dryness is not scored in this method as flakes are removed by the application procedure.
Grade   Description
0       Normal, no erythema
+       Barely perceptible erythema
1       Mild erythema, no edema
1+      Mild to moderate erythema, with/without edema
2       Moderate confluent erythema, with/without edema
2+      Moderate to deep erythema, edema
3       Deep erythema, edema, vesiculation

Each site was treated in the prescribed method for a maximum of fifteen washes or until a moderate confluent erythemic response (Grade 2) was elicited. When a grade of 2 or greater (Figure 1a) was attained, treatment of the site was discontinued. Continuing treatment beyond this point quickly results in a severe response and discomfort to the subject. The remaining flex area was washed until Grade 2 erythema was attained or fifteen treatments completed. Subjects were restricted from applying any moisturizing products to their arms. This included the use of body lotions, sunscreens, and bath oils. Additionally, subjects were instructed not to wash the test sites with soap during bathing. Sunbathing was prohibited during the test week.
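The stopping rule can be written down as a short sketch. It assumes four evaluations per day (one before each of the three washes and one four hours after the third), and it assigns half-point values to the intermediate grades "+", "1+", and "2+"; those numeric values and the names `GRADE_VALUE` and `washes_received` are illustrative assumptions, not taken from the paper.

```python
# Assumed numeric values for the seven grades; the half points for
# "+", "1+" and "2+" are an illustrative assumption.
GRADE_VALUE = {"0": 0.0, "+": 0.5, "1": 1.0, "1+": 1.5,
               "2": 2.0, "2+": 2.5, "3": 3.0}
ENDPOINT = 2.0      # Grade 2 or greater ends treatment of a site
MAX_WASHES = 15     # three washes daily for five days

def washes_received(grades):
    """Count the washes a site receives, given its successive grades.

    `grades` holds grade labels in evaluation order. Within each day,
    evaluations 1-3 precede a wash and evaluation 4 is the reading
    taken four hours after the third wash.
    """
    washes = 0
    for index, label in enumerate(grades):
        if GRADE_VALUE[label] >= ENDPOINT:
            break  # endpoint reached: no further washes at this site
        is_pre_wash = (index % 4) != 3  # every fourth reading ends the day
        if is_pre_wash and washes < MAX_WASHES:
            washes += 1
    return washes
```

Under these assumptions, a site graded 0 at all 20 evaluations receives the full fifteen washes, while a site that first shows Grade 2 at its seventh evaluation has received five.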
STATISTICAL ANALYSIS

The data were analyzed by four methods. When an endpoint was reached, that score was carried through for all remaining evaluations. Mean total irritation score and standard deviation were calculated for each bar tested. This "mean erythema score" was used for general intercomparisons between tests and to provide a graphic picture of a bar's performance during the test week.
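A minimal sketch of the carry-through rule and the summary score follows. The paper does not spell out how a site's total is formed, so the code assumes it is the sum of that site's scores over the 20 evaluations after carry-through; the function names and the numeric grade values are likewise assumptions made for illustration.

```python
from statistics import mean, stdev

N_EVALUATIONS = 20
ENDPOINT = 2.0

def carry_forward(scores, n_evaluations=N_EVALUATIONS, endpoint=ENDPOINT):
    """Extend a site's recorded scores to the full evaluation schedule.

    Scores are kept up to and including the first endpoint score, and
    that score then fills every remaining evaluation.
    """
    filled = []
    for score in scores:
        filled.append(score)
        if score >= endpoint:
            break
    if len(filled) < n_evaluations:
        if not filled or filled[-1] < endpoint:
            raise ValueError("evaluations missing but no endpoint score")
        filled.extend([filled[-1]] * (n_evaluations - len(filled)))
    return filled[:n_evaluations]

def mean_total_score(panel_scores):
    """Mean and standard deviation of per-site totals (assumed definition)."""
    totals = [sum(carry_forward(site)) for site in panel_scores]
    return mean(totals), stdev(totals)
```

For example, a site scored 0, 0.5, 2.0 on a five-evaluation schedule becomes 0, 0.5, 2.0, 2.0, 2.0, so sites that stop early still contribute to every evaluation point.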

For all statistical analyses, when a subject's first site reached an endpoint score, the scores for both sites were carried through for all remaining evaluations. However, treatment of the remaining site was continued. The following statistical methods were used:
1. A sign test utilizing the binomial equation was used to evaluate data for only those subjects who reached an endpoint. This method determined which product treatment resulted in achieving this endpoint first for erythema; dryness was not evaluated. Subjects who were able to complete all wash treatments on both arms were considered a tie.

2. The Wilcoxon matched pairs test was used to compare erythema scores at the time of first site termination regardless of the number of treatments. The Wilcoxon test was also used to compare scores at each observation point.

3. A survival test was conducted on the number of washes a site was exposed to prior to termination.

Figure 1a. The same panelist's right flex area that was washed nine times with bar B. This flex exhibits an irritation score of 2.
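The sign test in item 1 can be illustrated with the binomial distribution directly. The sketch below assumes a two-sided exact test with ties already excluded; the paper states only that the binomial equation was used, so the sidedness and the function name `sign_test_p` are assumptions.

```python
from math import comb

def sign_test_p(first_product_1, first_product_2):
    """Two-sided exact sign test p-value (assumed form of the test).

    first_product_1 and first_product_2 are the numbers of subjects
    whose site treated with product 1 or product 2, respectively,
    reached the endpoint first. Ties are excluded before calling.
    """
    n = first_product_1 + first_product_2
    if n == 0:
        return 1.0  # no informative subjects
    k = min(first_product_1, first_product_2)
    # Probability of a split at least this lopsided under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

If, say, ten subjects reached an endpoint and all ten did so first on the same product, the sketch gives p = 2/1024, or about 0.002.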

RESULTS

Irritation due to the use of the test material develops over the five-day test period. The clinical appearance of the flex area undergoing repeated washing with a cleansing bar ranges from normal (Grade 0), to deep erythema with edema (Grade 3). A score of Grade 2 is the termination point in the flex wash test due to panelist discomfort. A characteristic response for Grade 0 and Grade 2 is shown in Figure 1.

Figure 1b. A panelist's left flex area that has been washed nine times with bar A. No indication of irritation exists.
The skin's response to the treatment increases at different rates depending on the relative irritancy potential of the bar. Table II provides a summary of four studies comparing bars with varying degrees of relative irritancy potential. The increase in mean total score shows the rate of development of the observed irritation. The average number of washes is the number that could be conducted with that product before a moderate response (Grade 2) was elicited. Bar B develops a response quickly, with the average score approaching the maximum value by the twelfth evaluation (day 3). Milder bars H and A exhibit a gradual increase in response without the sites reaching the maximum endpoint score. The four-hour rest period between the third daily wash and the final daily evaluation allows the response to develop, resulting in the highest daily score. Recovery of the test sites occurs during the sixteen-hour overnight rest period between consecutive wash days. This results in plateaus in the response curve. Graphic representation (Figure 2) shows the sigmoidal pattern that typically becomes more prominent as the irritancy potential of the bar increases. Use of more irritating bars results in fewer survivors (less than Grade 2) at any time. Figure 2 also shows that repetitive testing of a bar produces very similar response curves in tests during different seasons.

Table II
Summary of Erythema Scores

                         Evaluation number
Test     Ave. #     4         8         12        16        20
bar      washes     (Day 1)   (Day 2)   (Day 3)   (Day 4)   (Day 5)

Bar A    19         2.0       4.5       4.5       6.5       6.5
Bar H    18         4.5       7.0       10.0*     11.5      12.5

Bar A    18         0.0       2.0       7.0       7.5       8.0
Bar I    14         0.5       11.0**    21.5**    21.5**    23.0**

Bar A    20         0.5       1.5       3.0       3.5       6.5
Bar B    10         4.0*      24.5**    31.0**    32.0**    32.5**

Bar A    20         0.5       1.0       4.0       7.0       9.5
Bar E    12         3.0**     20.5**    26.5**    32.0**    32.5**

All endpoint scores of 2.00 or greater were carried through for remaining evaluations. Values presented are informational only. Statistical analysis was performed using a Wilcoxon matched pair test.
* Significantly different at p < 0.05.
** Significantly different at p < 0.01.

This absence of seasonal variation and the reproducibility of the flex wash test is shown in Figure 3. The graph shows the mean total erythema score obtained for two products of different relative irritancy potentials over a fourteen-month time period. Mean scores for bars A and B were 5.7 ± 1.2 and 27.7 ± 2.0, respectively. The percent of panelists completing the study is also very consistent, with 93% ± 6% completing the studies with bar A and 5% ± 3% completing testing with bar B. Eleven marketed personal washing bars were directly compared to bar A using the flex wash test. The percentage of survivors and the mean number of evaluations completed for each personal washing bar is shown in Table III. Statistical analysis of the data indicates that all the bars tested were significantly more irritating than bar A based on the Wilcoxon matched pairs test. The sign and survival test also showed significant differences between all comparisons except A vs H. There were no cases where the subjects reached an endpoint score in the bar A-H comparison. Statistical analyses were only conducted on the bars that were directly compared.
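The percent-survivor measure used here can be computed from per-site score records. The sketch below is illustrative only: it assumes each site's record stops at its first endpoint score, treats a survivor as a site that has not yet been graded 2 or greater, and uses a hypothetical function name and assumed numeric grade values.

```python
def percent_survivors(panel_scores, n_evaluations=20, endpoint=2.0):
    """Percent of sites still below the endpoint at each evaluation.

    Each entry of `panel_scores` lists one site's scores in evaluation
    order; a record shorter than the schedule is assumed to have ended
    at its endpoint score.
    """
    n_sites = len(panel_scores)
    curve = []
    for t in range(n_evaluations):
        surviving = 0
        for scores in panel_scores:
            # A site survives at evaluation t if no score so far has
            # reached the endpoint.
            if all(s < endpoint for s in scores[: t + 1]):
                surviving += 1
        curve.append(100.0 * surviving / n_sites)
    return curve
```

The last value of the curve corresponds to the share of sites that completed the test week without reaching Grade 2.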

Figure 2. Rate of irritation development is shown as the percent survivors (less than Grade 2) at each evaluation time. Products shown: bar H; bar C; bar B (March and July tests).

Based on these data, the irritancy potential of these bars varied significantly: bars A and H were the milder products; bars C and L were more irritating; and bars B, D, E, F, G, I, J, and K were harsher. A comparison of the ingredients indicates that bars with high levels of soap were more irritating than the syndet (synthetic detergent) or soap/syndet bars. The addition of small amounts of glycerin, cocoa butter, mineral oil, and petrolatum did not appear to significantly decrease the relative irritancy potential for these products.

DISCUSSION

Repetitive clinical evaluations of two commercially available personal washing bars, bars A and B, demonstrate that the flex wash test is a reproducible assay that is minimally affected by local climatic fluctuations. Bars A and B were chosen for comparisons throughout the year since they represent the mildest commercially available bar and a soap bar of moderate harshness, respectively (2). In eleven evaluations of bar A, the percentage of subjects completing all fifteen washes was 93% ± 6%. Similarly, in five evaluations of bar B, the percentage of subjects completing all fifteen washes was 5% ± 3%. Although some variation of response occurred, it is mainly attributable to interpanel variability since subjects were not screened for soap sensitivity. The standard deviation was less than 10% between tests conducted at different times of the year. We believe the test method reduces the influences of stratum corneum hydration and turnover due to the sponge application procedure and therefore makes the test less weather dependent.

Figure 3. Seasonal comparisons of mean total irritation scores during various months are shown for bar A (solid) and bar B (crosshatch). Months shown: Jan, Mar, Apr, Jun, Jul, Aug, Oct, Nov.

We have found that the flex wash is capable of significantly discriminating between milder cleansing systems than is typically achievable with a chamber test. This result is illustrated in Table IV, which compares the relative differences observed for the flex wash and a modified soap chamber test (6) for three soap bars.
Using the flex wash one observes statistically significant differences that are not seen using a modified soap chamber test. Furthermore, the modified soap chamber test is reported to be as sensitive as the test previously designed by Frosch and Kligman (7). Given current trends toward milder active systems for personal cleansing, the ability to discriminate more easily between these formulations is an important aspect of any clinical testing regime. We find that the flex wash is capable of discriminating smaller differences in irritation potential, while good correlation to the soap chamber test can be found when the differences in irritation potential for two products are large (e.g., soap versus syndet bars). As an example, in these studies we chose bar A as a reference standard since Frosch and Kligman demonstrated that this product was the mildest of 18 bars tested in the soap chamber test (2). Utilizing the flex wash, bar A was also found to be superior in mildness to the eleven bars tested, which in overall terms correlates well with results reported for the soap chamber test. Bars B, C, G, J, and K were classified as "slightly irritating" in the original soap chamber test. Bars H and I were classified "moderately irritating." However, because many of the differences among products in both the chamber test and flex wash are not significant, some movement in the rank order from test to test would be expected.

Table III
The Mildness Attributes of Personal Washing Bars Used in the Flex Wash Test

         Mean total                           Mean number of
Bar      erythema score     % Survivors*      evaluations completed**
Bar A    5.7 ± 1.2          93 ± 6            19
Bar H    11.0***            83                18
Bar L    18.4***            31                14***
Bar C    20.4***            26                14***
Bar K    25.4***            15                7***
Bar E    24.4***            12                12***
Bar G    27.8***            12                8***
Bar J    26.1***            7                 9***
Bar B    27.7 ± 2.0***      5 ± 3             10***
Bar I    27.6***            0                 9***
Bar F    28.6***            0                 8***
Bar D    29.4***                              7***

* Data represents the average number of panelists completing 15 washes.
** Data represents the average number of evaluations (maximum of 20) the panelists completed in each experiment for a particular product.
*** Significantly different (p < 0.05) from bar A using a survival test for the mean number of evaluations completed and a Wilcoxon matched pair test for the mean total erythema scores.

Table IV
A Comparison of Erythema Scores for Three Bar Soaps Using a Modified Soap Chamber Test and the Flex Wash

Modified soap chamber test
Bar      Erythema score              Significance*
Bar G    1.09                        N.S.
Bar M    1.09                        N.S.
Bar N    0.94                        N.S.

Flex wash
Bar      Mean end point erythema     Significance**
Bar G    5.83
Bar N    2.20                        p 0.01
Bar M    1.69
Bar N    1.18                        p < 0.08

* Comparisons of means using Duncan's Multiple Range Test.
** Statistical analysis of rank scores using the Wilcoxon 2 sample test.

In conclusion, the flex wash test is a reliable method for evaluating the relative irritancy potential of personal washing products. The test is reproducible across a range of local seasonal variations. Product testing, as performed in the flex wash test, allows natural lather development from the bar and closely mimics product concentrations during home usage. Subject compliance is excellent since use of the flex area allows the panelist to follow a normal cosmetic and skin care regimen on the face and hands. The method is useful as a screening test when evaluating mild surfactants and other mildness agents, with directionality evident by the third day of the test. While the test has been designed to discriminate small differences in irritation potential, it lacks the ability to separate products based upon dryness.

ACKNOWLEDGMENTS

The authors wish to thank Ms Joan Barrows and Ms Dolores Borowski for their technical expertise, and Dr. Gary Grove for his consultation in the statistical analysis of the data.

REFERENCES

(1) S. H. Peck, J. Morse, T. Cornbleet, E. Mandel, and I. Kantor, Soap--Neutral vs alkaline, Skin, 1, 261 (1962).
(2) P. Frosch and A. M. Kligman, The soap chamber test--A new method for assessing the irritancy of soaps, J. Am. Acad. Dermatol., 1, 35-41 (1979).
(3) J. D. Justice, J. J. Travers, and L. J. Vinson, Testing detergent mildness, Soap Chem. Specialties, 37(8), 53-56 (1961).
(4) P. J. Frosch, "Irritancy of Soaps and Detergent Bars," in Principles of Cosmetics for the Dermatologist, P. Frosch and S. N. Horwitz, Eds. (C. V. Mosby Co., St. Louis, 1982), pp. 5-12.
(5) M. F. Lukacovic, F. E. Dunlap, S. E. Michaels, M. O. Visscher, and D. D. Watson, Forearm wash test to evaluate the clinical mildness of cleansing products, J. Soc. Cosmet. Chem., 39, 355-366 (1988).
(6) S. W. Babulak, L. D. Rhein, D. D. Scala, F. A. Simion, and G. L. Grove, Quantitation of erythema in a soap chamber test using the Minolta Chroma (Reflectance) Meter: Comparison of instrumental results with visual assessments, J. Soc. Cosmet. Chem., 37, 475-479 (1986).
(7) G. L. Grove, personal communication.
