

Biomedical Signal Processing and Control


Evaluation of convolutional neural networks using a large multi-subject P300 dataset

Lukáš Vařeka

NTIS New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, 306 14 Pilsen, Czech Republic
E-mail address: lvareka@ntis.zcu.cz

Article info

Article history: Received 6 August 2019; Received in revised form 4 December 2019; Accepted 31 December 2019

Keywords: Convolutional neural networks; Event-related potentials; P300; BCI; LDA; Machine learning

Abstract

Deep neural networks (DNN) have been studied in various machine learning areas. For example, event-related potential (ERP) signal classification is a highly complex task potentially suitable for DNN as signal-to-noise ratio is low, and underlying spatial and temporal patterns display a large intra- and inter-subject variability. Convolutional neural networks (CNN) have been compared with baseline traditional models, i.e. linear discriminant analysis (LDA) and support vector machines (SVM), for single trial classification using a large multi-subject publicly available P300 dataset of school-age children (138 males and 112 females). For single trial classification, classification accuracy stayed between 62% and 64% for all tested classification models. When applying the trained classification models to averaged trials, accuracy increased to 76–79% without significant differences among classification models. CNN did not prove superior to baseline for the tested dataset. Comparison with related literature, limitations and future directions are discussed.

© 2020 Elsevier Ltd. All rights reserved. https://doi.org/10.1016/j.bspc.2019.101837

1. Introduction

In recent years, both fundamental and applied research in deep learning has rapidly developed. In image processing and natural language processing, it has led to significantly better classification rates than previous state-of-the-art algorithms [7]. Therefore, there has been a growing interest in applying deep neural networks (DNNs) to various fields of applied research. Such an effort can also be seen in electroencephalographic (EEG) data processing and classification. A well-known application of EEG classification is a brain-computer interface (BCI) [18], which allows immobile persons to operate devices only by decoding their intent from the EEG signal without any need for muscle involvement. A significant challenge in BCI systems is to recognize the intention of the user correctly since the brain components of interest often have a significantly lower amplitude than the random EEG signal [18].

DNNs often do not require costly feature engineering, and thus could lead to more universal and reliable EEG classification. However, a recent review of the field reached the conclusion that so far, these benefits have not been convincingly presented in the literature [14]. Many studies did not compare the studied DNN to state-of-the-art BCI methods or performed biased comparisons, with either suboptimal parameters for the state-of-the-art competitors or with unjustified choices of parameters for the DNN [14]. A similar conclusion has been reached in another review of DNN and EEG [22]. Many related papers suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code [22]. Moreover, one of the drawbacks of DNNs is having to collect a large training dataset. Typical BCI datasets have very small numbers of training examples, since BCI users cannot be asked to perform thousands of mental operations before actually using the BCI. To overcome this problem, it has been proposed to obtain BCI applications with very large training databases, e.g. for multi-subject classification.

Multi-subject classification has one more advantage: it solves the problem of long DNN training times. Instead, a universal BCI system can be trained only once and then just applied to a new dataset from a new user without any additional training [14].

Guess the number (GTN) is a simple P300 event-related potential (ERP) BCI experiment. The measured participant is asked to pick a number between 1 and 9 and is then exposed to corresponding visual stimuli. The P300 waveform is expected following the selected (target) number. During the measurement, experimenters try to guess the selected number based on manual evaluation of average ERPs associated with each number. Finally, both the numbers thought and the guesses of the experimenters are recorded as metadata. 250 school-age children participated in the experiments that were carried out in elementary and secondary schools in the Czech Republic. Only three EEG channels (Fz, Cz, Pz) were recorded to decrease preparation time. Nevertheless, to the author's best knowledge, this is the largest P300 BCI dataset available so far [19].

The main aim of this paper is to evaluate one of the deep learning models, convolutional neural networks (CNN), for classification of P300 BCI data. Unlike most related studies, multi-subject classification was performed with the future goal of developing a universal BCI. Two state-of-the-art BCI classifiers were used as a baseline to minimize the risk of biased comparison. To avoid overtraining, cross-validation and final testing using a previously unused part of the dataset were performed. Another aim of this manuscript is to evaluate some CNN parameters in this application.

1.1. State-of-the-art

Although various BCI algorithms have been evaluated and published in recent decades, there is still no feature extraction or machine learning algorithm clearly established as state-of-the-art. However, several studies have focused on reviews and comparisons with partly consistent results. In [12], a comparison of several classifiers (Pearson's correlation method, Fisher's linear discriminant analysis (LDA), stepwise linear discriminant analysis (SWLDA), linear support vector machine (SVM), and Gaussian kernel support vector machine (nSVM)) was performed on 8 healthy subjects. It was shown that SWLDA and LDA achieved the best overall performance. As originally proposed by Blankertz et al. [3] and also confirmed in a recent review [14], shrinkage LDA is another useful tool for BCI, particularly with small training datasets. In [16], the authors demonstrated that LDA and Bayesian linear discriminant analysis (BLDA) were able to beat other classification algorithms.

Efforts to develop universal multi-subject P300 BCI machine learning have been relatively rare in the literature. In [21], the authors developed a generic shrinkage LDA classifier using the training data of 18 subjects. The performance was evaluated with the data of 7 subjects. It was concluded that the generic classifier achieved comparable results regarding effectiveness and efficiency as personalized classifiers.

2. Methods

2.1. Data acquisition

The data described in detail and accessible in [19] were used in subsequent experiments. The measurements were taken between 8 am and 3 pm. Unfortunately, the environment was usually quite noisy since many children and also many electrical devices were present in the room at the same time. However, in any case there were no people standing or moving behind the monitor or in the close proximity of the measured participant.

The participants were stimulated with numbers between 1 and 9 flashing on the monitor in random order. The numbers were white on a black background. The inter-stimulus interval was set to 1500 ms. The following hardware devices were used: the BrainVision standard V-Amp amplifier, a standard small or medium 10/20 EEG cap, a monitor for presenting the numbers, and two notebooks necessary to run the stimulation and recording software applications. The reference electrode was placed at the root of the nose and the ground electrode was placed on the ear. To speed up the guessing task, only three electrodes, Fz, Cz and Pz, were active. The stimulation protocol was developed and run using the Presentation software tool produced by Neurobehavioral Systems, Inc. The BrainVision Recorder was used for recording raw EEG data.

The participants were school-age children and teenagers (aged between 7 and 17; average age 12.9), 138 males and 112 females. All participants and their parents were informed about the programme of the day and the experiments carried out. All participants took part in the experiment voluntarily. The gender, age, and laterality of the participants were collected. No personal or sensitive data were recorded.

Fig. 1. Comparison of target and non-target epoch grand averages. As expected, there is a large P300 component following the target stimuli. Note that the P300 average latency is somewhat delayed compared to what is commonly reported in the literature [15].

2.2. Preprocessing and feature extraction

The data were preprocessed as follows (a code sketch of these steps is given after the list):

1. From each participant of the experiments, short parts of the signal (i.e. ERP trials, epochs) associated with two displayed numbers were extracted. One of them was the target (thought) number. The other was a randomly selected number out of the remaining stimuli between 1 and 9. Consequently, a similar number of training examples for both classification classes (target, non-target) was extracted. The extracted epochs were stored into a file (available in [20]).

2. For epoch extraction, intervals between 200 ms prestimulus and 1000 ms poststimulus were used. The prestimulus interval between −200 and 0 ms was used for baseline correction, i.e. computing the average of this period and subtracting it from the data. Thus, given the sampling frequency of 1 kHz, an 11,532 × 3 × 1200 (number of epochs × number of EEG channels × number of samples) data matrix was produced.

3. To skip severely damaged epochs, especially those caused by eye blinks or bad channels, an amplitude threshold was set to 100 µV according to common guidelines (such as in [15]). Any epoch x[c, t], with c being the channel index and t time, was rejected if:

   \max_{c,t} |x[c,t]| > 100 \quad (1)

   With this procedure, 30.3% of epochs were rejected. In Fig. 1, grand averages of accepted epochs (across all participants) are depicted.
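The preprocessing steps above can be summarized in a short NumPy sketch. This is a minimal illustration under the stated parameters (1 kHz sampling, −200 to 1000 ms epochs, 100 µV threshold), not the author's published code [26]; the array layout and function names are assumptions.

```python
import numpy as np

FS = 1000                     # sampling frequency (Hz)
PRE_MS, POST_MS = 200, 1000   # epoch borders relative to stimulus onset (ms)
PRE = PRE_MS * FS // 1000     # ... converted to samples
POST = POST_MS * FS // 1000
THRESHOLD_UV = 100            # amplitude rejection threshold (microvolts)

def extract_epochs(raw, onsets):
    """Cut epochs from a continuous recording.

    raw    : (n_channels, n_samples) continuous EEG in microvolts
    onsets : iterable of stimulus onset indices (in samples)
    returns: (n_epochs, n_channels, PRE + POST) array
    """
    return np.stack([raw[:, o - PRE:o + POST] for o in onsets
                     if o - PRE >= 0 and o + POST <= raw.shape[1]])

def baseline_correct(epochs):
    # Subtract the mean of the -200..0 ms prestimulus interval per channel.
    baseline = epochs[:, :, :PRE].mean(axis=2, keepdims=True)
    return epochs - baseline

def reject_artifacts(epochs):
    # Drop any epoch whose absolute amplitude exceeds 100 uV, cf. Eq. (1).
    keep = np.abs(epochs).max(axis=(1, 2)) <= THRESHOLD_UV
    return epochs[keep], keep
```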

Fig. 2. Flowchart of preprocessing, feature extraction and data splitting applied.

Fig. 3. Architecture of the convolutional neural network. There was one convolutional layer, one dense layer, and finally a softmax layer for binary classification (target/non-target). Batch normalization and dropout followed both the convolutional and dense layers.

Feature extraction. Many deep learning methods such as CNN are designed to avoid significant feature engineering [2,28]. On the other hand, linear classifiers usually perform better when the dimensionality of the original data matrix is reduced, and only the most significant features are extracted [3]. In the parameter optimization phase, the state-of-the-art classifiers were used either with the original data dimension, or after the feature selection proposed in [3], to compare the performance. The feature extraction method was based on averaging time intervals of interest and merging these averages across all relevant EEG channels to get reduced spatio-temporal feature vectors (windowed means feature extraction, WM). In line with recommendations for P300 BCIs, an a priori time window was initially set between 300 and 500 ms after stimuli [25]. This time window was further divided into 20 equal-sized time intervals in which amplitude averages were computed. Therefore, with three EEG channels, the dimensionality of feature vectors was reduced to 60. Finally, these feature vectors were scaled to zero mean and unit variance.
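A minimal sketch of the windowed-means (WM) feature extraction as described above, assuming baseline-corrected epochs shaped (epochs × channels × samples); the helper name and exact interval boundaries are illustrative, not taken from the paper's code.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def windowed_means(epochs, fs=1000, pre_ms=200,
                   window=(300, 1000), n_intervals=20):
    """Windowed-means features: average amplitude in equal sub-intervals.

    epochs : (n_epochs, n_channels, n_samples) baseline-corrected data
    returns: (n_epochs, n_channels * n_intervals) feature matrix
    """
    start = (pre_ms + window[0]) * fs // 1000
    stop = (pre_ms + window[1]) * fs // 1000
    # Split the time window into equal-sized sub-intervals and average each.
    segments = np.array_split(epochs[:, :, start:stop], n_intervals, axis=2)
    feats = np.stack([seg.mean(axis=2) for seg in segments], axis=2)
    return feats.reshape(len(epochs), -1)

# Scale the features to zero mean and unit variance, as described above:
# X = StandardScaler().fit_transform(windowed_means(epochs))
```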

2.3. Classification

Fig. 2 depicts the procedures used to extract features and split the data for classification.

Data splitting. Before classification, the data were randomly split into training (75%) and testing (25%) sets. Using the training set, 30 iterations of Monte-Carlo cross-validation (again 75:25 from the subset) were performed to optimize parameters. Results using the holdout testing set were computed in each cross-validation iteration and averaged at the end of the processing. No parameter decision was based on the holdout set.
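The splitting scheme might be implemented along the following lines with scikit-learn; `make_classifier` is a hypothetical factory for any of the classifiers described below, and the random seeds are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit, train_test_split

def evaluate(X, y, make_classifier, seed=0):
    """Monte-Carlo cross-validation scheme sketched from the description above.

    make_classifier: callable returning a fresh (unfitted) estimator,
    e.g. the LDA or SVM configured in the following paragraphs.
    """
    # 75/25 split into training data and a holdout testing set.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                              random_state=seed)
    # 30 iterations, each again splitting the training set 75:25.
    splitter = ShuffleSplit(n_splits=30, test_size=0.25, random_state=seed)
    val_acc, test_acc = [], []
    for fit_idx, val_idx in splitter.split(X_tr):
        clf = make_classifier()
        clf.fit(X_tr[fit_idx], y_tr[fit_idx])
        val_acc.append(clf.score(X_tr[val_idx], y_tr[val_idx]))
        # Holdout results are computed per iteration and averaged at the end;
        # no parameter decision is based on them.
        test_acc.append(clf.score(X_te, y_te))
    return np.mean(val_acc), np.mean(test_acc)
```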

LDA. A state-of-the-art LDA [3] with eigenvalue decomposition used as the solver and automatic shrinkage using the Ledoit–Wolf lemma [13] was applied.
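In scikit-learn, an equivalent setup can be obtained with the 'eigen' solver and automatic Ledoit–Wolf shrinkage; a sketch, not necessarily the exact code used in the study:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Eigenvalue-decomposition solver with automatic Ledoit-Wolf shrinkage.
lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
# lda.fit(X_train, y_train); lda.predict(X_test)
```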

SVM. The implementation was based on libsvm [4]. Both recommendations in the literature [8] and the validation subsets were used to find the optimal parameters. Finally, the penalty parameter C was set to 1, the kernel cache to 500 MB, and the degree of the polynomial kernel function to 3. A one-vs-rest decision function shape with the RBF kernel type and shrinking heuristics was used.
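The listed values correspond closely to the defaults of scikit-learn's libsvm-based SVC; a sketch under that assumption:

```python
from sklearn.svm import SVC

# RBF kernel, C=1, 500 MB kernel cache, shrinking heuristics enabled;
# degree=3 only matters for the polynomial kernel and is kept at its default.
svm = SVC(C=1.0, kernel="rbf", degree=3, cache_size=500,
          shrinking=True, decision_function_shape="ovr")
# svm.fit(X_train, y_train); svm.predict(X_test)
```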

Fig. 4. Decrease of classification loss for the baseline CNN architecture. Although training loss kept declining throughout all 30 epochs, validation loss reached its minimum after only five epochs. Because the patience parameter was set to five, in this case the training was stopped after 10 epochs. As seen from the growing difference between training and validation loss, further training would lead to substantial overtraining.

CNN. Convolutional neural networks were implemented in Keras [5]. They were configured to maximize classification performance on the validation subsets. The structure is depicted in Fig. 3. Initially, after empirical parameter tuning based on cross-validation, the parameters were selected as follows (a Keras sketch of this configuration is given after the list):

– The first convolutional layer had six 3 × 3 filters. The filter size was set to cover all three EEG channels. Both the second filter dimension and the number of filters were tuned experimentally.
– In both cases (after the convolutional and dense layers), dropout was set to 0.5.
– The output of the convolutional layer was further downsampled by a factor of 8 using an average pooling layer.
– The ELU activation function [6] was used for both the convolutional and dense layers, as recommended in related literature [23]. Compared to the sigmoid function, ELU mitigates the vanishing gradient problem by using the identity for positive values. Moreover, in contrast to rectified linear units (ReLU), ELUs have negative values, which allows them to push mean unit activations closer to zero while ensuring a noise-robust deactivation state [6]. The parameter α > 0 was set to 1:

  f(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha(e^{x} - 1) & \text{if } x \le 0 \end{cases}

– Batch size was set to 16.
– Cross-entropy was used as the loss function.
– The Adam optimizer [11] was used for training because it is computationally efficient, has low memory requirements and is frequently used in the field [22].
– The number of training epochs was set to 30.
– Early stopping with the patience parameter of 5 was used.
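A Keras sketch assembling the listed choices is given below. The width of the dense layer and the exact ordering of activation, batch normalization and dropout are assumptions where the text and Fig. 3 leave them open; labels are assumed one-hot encoded for the categorical cross-entropy loss.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(n_channels=3, n_samples=1200, dense_units=100):
    """Sketch of the described CNN; dense_units is an assumed value."""
    model = keras.Sequential([
        keras.Input(shape=(n_channels, n_samples, 1)),
        # One convolutional layer: six 3x3 filters spanning all 3 channels.
        layers.Conv2D(6, kernel_size=(3, 3), activation="elu"),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        # Downsample the time axis by a factor of 8 with average pooling.
        layers.AveragePooling2D(pool_size=(1, 8)),
        layers.Flatten(),
        layers.Dense(dense_units, activation="elu"),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        # Softmax output for binary target/non-target classification.
        layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss="categorical_crossentropy",   # one-hot labels assumed
                  metrics=["accuracy"])
    return model

# Training setup: batch size 16, at most 30 epochs, early stopping patience 5.
# model = build_cnn()
# model.fit(X_train, y_train, batch_size=16, epochs=30,
#           validation_data=(X_val, y_val),
#           callbacks=[keras.callbacks.EarlyStopping(patience=5,
#                                                    restore_best_weights=True)])
```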

3. Results

As mentioned above, cross-validation for hyperparameter estimation was followed by testing on a holdout set. Accuracy, precision, recall and AUC (area under the ROC curve) have been computed [10]. In the validation phase, the aim was to reach the configuration yielding the highest accuracy while ensuring it is not at the expense of precision and recall. In Fig. 4, an example of searching for an optimal configuration of CNN weights and biases based on the training and validation sets is shown.
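With scikit-learn, the reported metrics can be computed roughly as follows (a sketch; `y_score` denotes per-trial scores or probabilities for the target class, needed for AUC):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

def report(y_true, y_pred, y_score):
    """y_pred: predicted labels; y_score: target-class scores/probabilities."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),
    }
```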

3.1. Effect of parameter modifications on validation performance

Feature extraction for LDA and SVM. Parameter optimization of the classifiers themselves has been discussed above. Additionally, different feature extraction settings were compared regarding the average classification results achieved during cross-validation. Results of the comparisons are depicted in Table 1. Accuracy had an increasing trend when the time window was prolonged to 800 and 1000 ms. It can be speculated that the standard a priori time window is not enough for capturing target to non-target differences when classifying children's data that display a large variety in their P300 components. As expected, classification performance with WM features was slightly higher than for preprocessed epochs without feature extraction. Based on the results, both LDA and SVM configured as described above, with the time window between 300 and 1000 ms, were used in the testing phase.

CNN. The neural network architecture described above was used as the starting point. However, some parameter modifications were explored regarding their effect on the validation classification results. The results are shown in Table 2. Performance mostly displayed only small and insignificant changes with these parameter modifications. Consistently with [23], batch normalization led to slightly better accuracy. Moreover, the absence of batch normalization made the results less predictable and more fluctuating, as can be seen in the standard deviation of recall. Another clear decrease in performance was observed without dropout regularization. Finally, average pooling was better than max pooling for the validation data. Consequently, the initial configuration described in Section 2.3 was used for testing.

3.2. Testing results

Based on the results in Section 3.1, both the feature extraction method for LDA and SVM, and the CNN configuration achieving the best average accuracy during cross-validation were selected for the testing phase. Fig. 5 shows the achieved results. All tested models achieved comparable classification results. LDA had the highest classification recall (around 67%). Single trial classification accuracy stayed within the range between 62% and 64%.

Averaging of epochs associated with the same markers is a standard ERP technique for increasing the signal-to-noise ratio [15]. When averaging, repeated ERPs including the P300 are amplified while continuous random EEG noise is suppressed. Because even in P300 BCIs, repeated stimulation is usually used to achieve good performance [17], it is worth exploring how once-trained classifiers can generalize to averaged epochs. Therefore, consecutive groups of one to six neighboring epochs from the testing set were used instead of single trials. Fig. 6 depicts the results achieved. With averaging, classification accuracy increased from the original 61–64% up to 76–79%. There were no significant differences among classifiers, although CNN displayed slightly higher standard deviations.
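A sketch of how such averaged test trials can be formed; grouping consecutive epochs within the same class label is the natural reading of the description, though the exact grouping used in the paper may differ:

```python
import numpy as np

def average_groups(X, y, n_avg):
    """Average every n_avg consecutive epochs that share the same label.

    X: (n_epochs, ...) preprocessed epochs or feature vectors
    y: class labels (0 = non-target, 1 = target)
    """
    Xa, ya = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        for g in range(len(Xc) // n_avg):
            Xa.append(Xc[g * n_avg:(g + 1) * n_avg].mean(axis=0))
            ya.append(label)
    return np.array(Xa), np.array(ya)

# e.g. accuracy of an already trained classifier on 4-trial averages:
# Xa, ya = average_groups(X_test, y_test, n_avg=4)
# clf.score(Xa, ya)
```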

4. Discussion

Single trial classification accuracy was between 62% and 64% for all tested classification models, without significant differences. Similar results have been commonly reported in the literature. For example, in [9], 65% single trial accuracy was achieved (using one to three EEG channels and personalized training data). In [24], 40–66% classification accuracy was reported, highly dependent on the tested subject. This manuscript achieved comparable performance for a large multi-subject dataset of school-age children.

On the single trial level, CNN achieved comparable performance to both LDA and SVM. Similar performance was also achieved when applying averaged testing epochs. However, CNN seemed slightly less stable and more dependent on the training/validation split, as can be seen in the standard deviations.

Consistently with related deep learning literature [23], the combination of ELUs, dropout and batch normalization was beneficial for classification performance. Unlike many image classification applications, average pooling was better than max pooling, perhaps because it is not associated with data loss: even less prominent features may contribute to classifier discriminative abilities. To further verify how the CNN was able to classify between targets and non-targets, the network was exposed to all target, or all non-target patterns. Average hidden layer outputs (the 4th, average pooling, layer used as an example) across these conditions were calculated and are shown in Fig. 7. There is a clear difference between some CNN outputs, although most remain stable across both conditions.

Table 1
Average cross-validation classification results based on the feature extraction method with the LDA classifier configured as described in Section 2.3. Averages from 30 repetitions and related sample standard deviations (in brackets) are reported. WM = windowed means (time intervals relative to stimulus onsets in square brackets).

Feature extraction    AUC             Accuracy        Precision       Recall
WM [300–500 ms]       59.56% (1.04)   59.54% (1.04)   59.48% (1.83)   61.69% (2.08)
WM [300–800 ms]       60.94% (1.04)   60.93% (1.05)   60.75% (1.9)    63.38% (1.85)
WM [300–1000 ms]      61.77% (0.9)    61.76% (0.91)   61.45% (1.9)    64.64% (1.48)
None                  61.09% (1.13)   61.08% (1.13)   61.68% (1.67)   59.90% (1.35)

The configuration that yielded the highest accuracy was WM [300–1000 ms].

Table 2
Average cross-validation classification results based on the CNN parameter settings. Averages from 30 repetitions and related sample standard deviations (in brackets) are reported. The CNN configuration described in Section 2.3 was used as the baseline model.

Changed parameter          AUC             Accuracy        Precision       Recall
None                       66.12% (0.68)   62.18% (0.94)   62.76% (1.95)   61.34% (2.63)
ReLUs instead of ELUs      66.36% (0.62)   61.85% (1.15)   62.7% (2.19)    60.1% (3.04)
Filter size (3, 30)        65.84% (0.49)   61.95% (1.18)   62.7% (2.1)     60.5% (3.91)
12 conv. filters           66.31% (0.51)   61.83% (1.1)    62.3% (2.21)    61.6% (3.08)
No batch normalization     65.99% (0.77)   60.55% (1.52)   61.02% (3.16)   61.5% (7.21)
Dropout 0.2                67.67% (0.65)   60.8% (1.49)    61.33% (2.31)   60.33% (4.0)
No dropout                 68.63% (1.11)   59.49% (1.2)    59.61% (1.93)   60.7% (4.44)
Dense (150)                66.07% (0.8)    61.81% (0.95)   62.33% (1.83)   61.18% (2.49)
Two dense l. (120-60)      65.72% (0.77)   62.11% (0.9)    63.14% (2.03)   59.5% (2.55)
Max- instead of AvgPool    64.23% (1.15)   58.94% (1.94)   60.22% (4.18)   59.24% (13.76)

The configuration that yielded the highest accuracy was the baseline model (None).

Fig. 5. Testing results for single trial classification (error bars show standard deviations).

Fig. 6. Testing results when averaging neighboring epochs (error bars show standard deviations).

In our previous work [27], we applied stacked autoencoders (SAE) to the same GTN dataset. In contrast with the current work, manual feature extraction using the discrete wavelet transform was performed. Instead of single trial classification, the success rate of detecting the number thought based on multiple single trial classification results was computed. The maximum success rate on the testing dataset was 79.4% for SAE, 75.6% for LDA and 73.7% for SVM. It seems that while SAE combined with traditional feature engineering and involving multiple trials per marker can outperform linear classifiers, the same benefits cannot be repeated when applying CNN to single trial classification of raw EEG data.

Computational efficiency is another important factor to consider when applying the methods in online BCI systems. An experimental comparison was performed with an Intel Core i7-7700K (four cores, 4.2 GHz), 64 GB RAM and an NVIDIA GeForce GTX 1050 Ti GPU. CNN took 46 s to train on the CPU and 26 s on the GPU. Both LDA and SVM were much faster to train, taking 300 and 1600 ms, respectively. However, training times were not critical in the presented experiment since any universal classifier needs to be trained just once and not with every new BCI user. Testing times were calculated relative to one processed feature vector and were low enough for all classifiers (CNN took 0.3 ms to classify one pattern on the CPU and 0.1 ms on the GPU, LDA took 0.1 ms and SVM 0.2 ms). It can be concluded that all tested algorithms can be used in online BCIs. Neural networks are slower to train and this could be a problem for personalized BCIs, retrained with each new user.

Fig. 7. Average outputs of the 4th (pooling) layer are depicted after the CNN was exposed to all target/non-target patterns. The x-axis corresponds to indices of convolutional filters (six in total). The y-axis is the output of the convolution, originally corresponding to time information, further downsampled by average pooling by a factor of 6. There is a clear difference in outputs, mainly in the bottom part of the maps. However, many outputs seem independent of classification labels, contributing poorly to CNN discrimination abilities.

There are several limitations of the reported experiments. As a noise suppression procedure, severely damaged epochs (with amplitude exceeding ±100 µV when compared to baseline) were rejected before further processing. While epoch rejection is beneficial for classification accuracy, it would also lead to lower bit-rates when used in on-line P300 BCI systems because of data loss. Artifact correction methods based on Independent Component Analysis were not feasible because of the low number of EEG channels (three). Moreover, the low number of EEG channels could have a detrimental effect on classification performance because of the limited spatial information provided on the input. Another possible limitation is that there might be a CNN architecture that would lead to better classification performance and had not been discovered by the author. However, several manipulations of CNN parameters were tested using cross-validation, including adding a new dense layer, with only very modest changes in validation classification accuracy.

A recent review of EEG and DNN studies [22] reported the median gain in accuracy of DNNs over traditional baselines to be 5.4%. It also revealed significant challenges in the field. A low number of training examples is a common complaint, especially for event-related data that contain the relevant information in the time domain. In this case, only a small fraction of the continuous EEG measurement near the onset of trials can be used, and strategies such as overlapping time windows to obtain more examples in the frequency domain are not feasible. In the current study, 11,532 epochs were used, which is below both the mean number of examples (251,532) and the median number of examples (14,000) in the reviewed papers [22]. Strategies such as data augmentation can be considered to increase the number of training examples to be sufficient for DNNs. Moreover, half of the studies [22] used between 8 and 62 EEG channels. Adding more channels to Fz, Cz and Pz could increase spatial resolution and accuracy but would also increase preparation time and the participant's discomfort. In future work, the effect of the number of EEG channels on P300 classification accuracy can be investigated further. Furthermore, soft or hard thresholding based on the discrete wavelet transform can be considered for noise cancellation [1]. Another line of research would be to propose different deep learning models for the same classification task, with extensive parameter grid search, or genetic algorithms. Based on a recent review of the field [29], frequently cited and promising networks include recurrent neural networks, especially long short-term memory (LSTM). Moreover, a CNN layer to capture spatial patterns can be followed by an LSTM layer for temporal feature extraction [29].
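As a purely hypothetical illustration of the CNN + LSTM direction mentioned above (not evaluated in this study), such a hybrid could be assembled in Keras as:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical hybrid: a convolution across channels for spatial patterns,
# followed by an LSTM over the time axis; layer sizes are arbitrary choices.
hybrid = keras.Sequential([
    keras.Input(shape=(3, 1200, 1)),
    layers.Conv2D(6, kernel_size=(3, 3), activation="elu"),  # spatial filters
    layers.Reshape((-1, 6)),                                  # (time, features)
    layers.LSTM(32),                                          # temporal features
    layers.Dense(2, activation="softmax"),
])
```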

5. Conclusion

The aims of the presented experiments were to compare CNN with baseline classifiers (LDA, SVM) using a large multi-subject P300 dataset. CNN was applied to raw ERP epochs (with the dimensionality of 3 × 1200). Baseline classifiers were applied to windowed means features (with the dimensionality of 60). Empirical parameter optimization was performed using cross-validation and the classifiers were tested on a holdout set. Various CNN parameters are discussed. Single trial classification accuracy was between 62% and 64% for all tested models, with CNN able to match but not outperform its competitors. When the trained models were applied to averaged trials in the testing phase, accuracy increased up to 76–79%. The achieved accuracy is comparable with the state-of-the-art despite using a multi-subject dataset from 250 children. Potential explanations of the results are discussed. Based on the results, LDA and SVM with state-of-the-art feature extraction still seem to be a good choice for P300 classification, especially with relatively small training datasets. CNN might need more spatial information in the data (by means of more channels) to better understand the patterns. Alternatively, the dataset was not large enough for CNN to prove its benefits and, e.g., data augmentation techniques could help to overcome this obstacle. Both the preprocessed data [20] and Python codes [26] are available to ensure reproducibility of the experiments.

Authors' contribution

LV designed and performed the machine learning workflow. LV wrote the manuscript.

Acknowledgement

This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports under the program NPU I.


Conflict of interest

None declared.

References

[1] M. Ahmadi, R. Quian Quiroga, Automatic denoising of single-trial evoked potentials, NeuroImage 66 (2013) 672–680.
[2] Y. Bengio, A.C. Courville, P. Vincent, Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives, 2012, arXiv:1206.5538.
[3] B. Blankertz, S. Lemm, M. Treder, S. Haufe, K. Müller, Single-trial analysis and classification of ERP components - a tutorial, NeuroImage 56 (2) (2011) 814–825, http://dx.doi.org/10.1016/j.neuroimage.2010.06.048.
[4] C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011) 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm.
[5] F. Chollet, et al., Keras, 2015, https://keras.io.
[6] D.A. Clevert, T. Unterthiner, S. Hochreiter, Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2016, arXiv:1511.07289.
[7] L. Deng, D. Yu, Deep learning: methods and applications, Found. Trends Signal Process. 7 (3-4) (2014) 197–387, http://dx.doi.org/10.1561/2000000039.
[8] R.E. Fan, K.W. Chang, C.J. Hsieh, X.R. Wang, C.J. Lin, LIBLINEAR: a library for large linear classification, J. Mach. Learn. Res. 9 (2008) 1871–1874, http://dl.acm.org/citation.cfm?id=1390681.1442794.
[9] N. Haghighatpanah, R. Amirfattahi, V. Abootalebi, B. Nazari, A single channel-single trial P300 detection algorithm, 2013 21st Iranian Conference on Electrical Engineering (ICEE) (2013) 1–5, http://dx.doi.org/10.1109/IranianCEE.2013.6599576.
[10] M. Hossin, M. Sulaiman, A review on evaluation metrics for data classification evaluations, Int. J. Data Mining Knowl. Manag. Process 5 (2) (2015) 1.
[11] D.P. Kingma, J. Ba, Adam: A Method for Stochastic Optimization, 2014, arXiv:1412.6980. Published as a conference paper at the 3rd International Conference for Learning Representations, San Diego, 2015.
[12] D.J. Krusienski, E.W. Sellers, F. Cabestaing, S. Bayoudh, D.J. McFarland, T.M. Vaughan, J.R. Wolpaw, A comparison of classification techniques for the P300 speller, J. Neural Eng. 3 (4) (2006) 299–305, http://dx.doi.org/10.1088/1741-2560/3/4/007.
[13] O. Ledoit, M. Wolf, Honey, I shrunk the sample covariance matrix, J. Portf. Manag. 30 (4) (2004) 110–119, http://dx.doi.org/10.3905/jpm.2004.110.
[14] F. Lotte, L. Bougrain, A. Cichocki, M. Clerc, M. Congedo, A. Rakotomamonjy, F. Yger, A review of classification algorithms for EEG-based brain-computer interfaces: a 10 year update, J. Neural Eng. 15 (3) (2018) 031005, http://dx.doi.org/10.1088/1741-2552/aab2f2.
[15] S.J. Luck, An Introduction to the Event-Related Potential Technique, MIT Press, Cambridge, MA, 2005.
[16] N.V. Manyakov, N. Chumerin, A. Combaz, M.M. Van Hulle, Comparison of classification methods for P300 brain-computer interface on disabled subjects, Intell. Neurosci. (2011) 2:1–2:12, http://dx.doi.org/10.1155/2011/519868.
[17] D.J. McFarland, W.A. Sarnacki, G. Townsend, T. Vaughan, J.R. Wolpaw, The P300-based brain-computer interface (BCI): effects of stimulus rate, Clin. Neurophysiol. 122 (4) (2011) 731–737.
[18] D.J. McFarland, J.R. Wolpaw, Brain-computer interfaces for communication and control, Commun. ACM 54 (5) (2011) 60–66, http://dx.doi.org/10.1145/1941487.1941506.
[19] R. Mouček, L. Vařeka, T. Prokop, J. Štěbeták, P. Brůha, Event-related potential data from a guess the number brain-computer interface experiment on school children, Sci. Data 4 (2017).
[20] R. Mouček, L. Vařeka, T. Prokop, J. Štěbeták, P. Brůha, Replication Data for: Evaluation of Convolutional Neural Networks Using A Large Multi-Subject P300 Dataset, 2019, http://dx.doi.org/10.7910/DVN/G9RRLN.
[21] A. Pinegger, G. Müller-Putz, No training, same performance!? A generic P300 classifier approach, Proceedings of the 7th International BCI Conference Graz 2017 (2017), http://dx.doi.org/10.3217/978-3-85125-533-1-77.
[22] Y. Roy, H.J. Banville, I. Albuquerque, A. Gramfort, T.H. Falk, J. Faubert, Deep Learning-Based Electroencephalography Analysis: A Systematic Review, 2019, arXiv:1901.05498.
[23] R.T. Schirrmeister, J.T. Springenberg, L.D.J. Fiederer, M. Glasstetter, K. Eggensperger, M. Tangermann, F. Hutter, W. Burgard, T. Ball, Deep Learning With Convolutional Neural Networks for Brain Mapping and Decoding of Movement-Related Information from the Human EEG, 2017, arXiv:1703.05051.
[24] N. Sharma, Single-Trial P300 Classification Using PCA With LDA, QDA and Neural Networks, 2017, arXiv:1712.01977.
[25] D.S. Tan, A. Nijholt, Brain-Computer Interfaces: Applying Our Minds to Human-Computer Interaction, 1st ed., Springer Publishing Company, Incorporated, 2010.
[26] L. Vařeka, CNN for GTN, 2019, https://bitbucket.org/lvareka/cnnforgtn/src/master/.
[27] L. Vařeka, T. Prokop, P. Mautner, R. Mouček, J. Štěbeták, Application of stacked autoencoders to P300 experimental data, Proceedings of the 16th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2017 (2017).
[28] M.D. Zeiler, R. Fergus, Visualizing and understanding convolutional networks, 2013, arXiv:1311.2901.
[29] X. Zhang, L. Yao, X. Wang, J. Monaghan, D. McAlpine, A Survey on Deep Learning Based Brain Computer Interface: Recent Advances and New Frontiers, 2019, arXiv:1905.04149.
