Review of Machine - Deep Learning in An Artistic Context - Machine Intelligence Report - Medium

08/09/2016
Introduction
Machine learning is a field of artificial intelligence that investigates how algorithms
can learn from observations and data, as opposed to humans explicitly programming
in each step of what a piece of software should do. These algorithms enable computers to
find and deduce complex relationships and patterns in data, and they produce outputs or
decisions based on statistical models. The field has remained primarily academic for
decades. However, with recent developments, especially in deep learning, and increases
in computing power, machine learning is starting to appear in our everyday lives.
These algorithms are now in our pockets powering speech recognition [Hinton et al.
2012] and in our email clients filtering spam [Guzella and Caminhas 2009]; they are
captioning images [Karpathy and Fei-Fei 2015], translating text [Sutskever and
Vinyals 2014] and driving cars [Thrun et al. 2006].
For traditional 'shallow' machine learning to work optimally, high-dimensional complex
data needs to be manually analyzed and domain-specific low-dimensional features
engineered [LeCun 2012]. This makes such methods inefficient for many real-world problems.
Deep learning algorithms, however, can learn which features to extract, eliminating the
feature engineering phase and allowing them to operate on high-dimensional, more
complex, real-world data. These qualities have made them very popular in recent
years, especially Convolutional Neural Networks (CNN) and Recurrent Neural
Networks (RNN).
Algorithmically generating images and sound is a well-established research area.
However, there are recent developments in the field using deep learning that have
produced distinctive results [Nguyen et al. 2015] [Gatys et al. 2015] [Mordvintsev et
al. 2015] [Sturm 2015]. Unfortunately, these deep learning content generation
techniques currently cannot run in real time or interactively. Other recent research has
combined deep learning with reinforcement learning to allow adaptive online learning.
This has been demonstrated with a system learning to play video games simply by
watching the screen, with no prior knowledge of the game [Mnih et al. 2013].
Furthermore, there are converging trends in the industry between gaming, general
entertainment and media consumption, a trend which is inspired and driven
primarily by anti-disciplinary artists. Major film festivals around the world such as
Sundance, Tribeca, and Toronto are exploring and promoting interactive storytelling,
non-linear narrative and transmedia experiences. Product launches are now
accompanied by an almost compulsory immersive interactive experience. Similar
developments are being seen in music, dance and theatre.
Technology that was once confined to academia and research labs is reaching
mainstream consumer devices, and driving new markets. Inspired by CAVE-like
immersive environments [Cruz-Neira et al. 1993], our own 2011 work [Akten et al.
2011] uses six consumer projectors and three Sony PlayStation 3s with PS Move
controllers to projection-map a living room with dynamic content reacting to its
inhabitant. Two years later Microsoft Research's IllumiRoom [Jones et al. 2013],
followed by RoomAlive [Jones et al. 2014], demonstrated their interest in bringing this
technology to everyone's living room. Likewise MIT Media Lab's SixthSense [Mistry
and Maes 2009] investigates bringing augmented mixed reality to a smaller, personal,
portable scale. Nintendo's Wii controller, Sony's PS Move controller and Microsoft's
Kinect depth camera all brought alternate interaction paradigms to the living room.
Virtual reality, for decades only found in academia or military use, is now making its
way into the living room with Facebook's recent purchase of Oculus Rift, Google's
Cardboard VR, and many other mainstream technology brands. Even Kellogg's are
making a cereal box which can be used as a cardboard VR headset, complete with their
own iOS and Android app. Within a few years, with the launch of Microsoft HoloLens
and Google-backed Magic Leap, augmented & mixed reality is also likely to become a
common household experience.
We've also seen trends in hardware platforms, expanding from PCs and dedicated
gaming consoles, to mobile phones and tablets, and potentially to other platforms such
as emerging smart watches, Internet of Things, and even consumer quadrotor drones,
equipped with programmable embedded computers, potential platforms for
augmented games and activities.
Through pervasive, ubiquitous smart hardware and software, all of these
developments are leading to the mainstream adoption of Multi-Modal Mixed Reality
Responsive Environments. The most groundbreaking work done in these fields is
often driven by re-appropriated technology, and most frequently the pioneers of this
discipline-bending misuse of technology are artists.
Brief overview of Machine Learning

Machine learning is a field of artificial intelligence which investigates how a system
can improve its performance on a task with respect to a specific measure, based on its
past experience [Mitchell 1997].
Machine learning algorithms find complex relationships in data. They
build models based on observations, and learn rules required to make optimum
decisions or predictions. They can recognize patterns in data that humans may not be
able to recognize, or even if humans could recognize those patterns, they might not be
consciously aware of how to formulate them in a way that could be programmed in
traditional non-ML ways.
This enables users of machine learning algorithms to create systems which exhibit
more intricate and complicated behaviour than they would be able to implement
directly. This is a quality of machine learning which is both its greatest strength and
its greatest danger, as it can learn biases found in data, or introduce new biases
by overfitting or underfitting. These dangers are amplified further because ML
algorithms are relatively difficult to debug and peer inside of.
The field has been studied for many decades. In his 1948 essay [Turing 1948], Alan
Turing describes how machines could be designed to learn using what he named B-type
unorganised machines, conceptual precursors to modern-day artificial neural
networks. For many years machine learning remained primarily an academic research
area. In many cases it did not see mainstream use due to high computational
requirements and inferior performance compared to other AI methods [LeCun 2014].
However, advances in machine learning algorithms and increases in computing power,
specifically highly parallel GPU computing, have enabled dramatic advancements
in how machine learning can be applied [Ciresan et al. 2011]. Gradually machine
learning has outperformed other AI techniques in fields such as speech recognition
[Hinton et al. 2012], natural language processing [Collobert et al. 2011], computer
vision [Couprie et al. 2013], email spam filtering [Guzella and Caminhas 2009], image
captioning [Karpathy and Fei-Fei 2015], robotics and self-driving cars [Thrun et al.
2006].
The key aspect of machine learning is supplying a learning algorithm with data. The
ML algorithm analyses the data and constructs a model. As new input data is
presented to the ML algorithm, it is fed into the model; the model processes the input
and outputs a decision or prediction.
Supervised Learning
In Supervised Learning, the model is trained on data which contains metadata such
as labels for classification or n-dimensional output vectors which can be mapped to
the m-dimensional input vectors for regression. Once trained, the model can predict
the classification or regression for a new input vector. This metadata needs to be
assigned to each training data item as input-output pairs manually by a person. This
makes supervised learning algorithms very powerful, but they can also be cumbersome
and time-consuming to prepare. Online crowdsourcing platforms such as Mechanical
Turk have helped accelerate the preparation of large labelled datasets.
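As a minimal sketch of the train-then-predict cycle described above, the following uses a nearest-centroid classifier, chosen here only because it is short; it is not a method taken from the cited literature. The two-dimensional input vectors and the labels "still" and "wave" are made-up illustrative data.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Average the training vectors of each label into one centroid per class."""
    grouped = defaultdict(list)
    for vector, label in zip(samples, labels):
        grouped[label].append(vector)
    return {
        label: [sum(axis) / len(vectors) for axis in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid lies closest to the new input vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical, manually labelled input-output pairs (the "metadata").
training_inputs = [(0.1, 0.2), (0.0, 0.4), (0.9, 0.8), (1.0, 0.7)]
training_labels = ["still", "still", "wave", "wave"]

model = fit_centroids(training_inputs, training_labels)
print(predict(model, (0.8, 0.9)))
```

The labelled pairs are the costly part: every item in `training_inputs` needs a person to supply the matching entry in `training_labels`.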
Unsupervised Learning
In Unsupervised Learning, the model is built with no metadata, no additional
information supplied by a person. The algorithm builds a model based on its own
observations. It analyses the data to extract relationships, associations and features.
When new data is presented, the model can predict how the new data relates to any of
the existing data based on the patterns it has already found. These algorithms are
useful for clustering, and learning about features.
Semi-supervised Learning
A combination of the two, Semi-supervised Learning is used when some of the data
is manually labelled with metadata, and some of it isn't. Upon training, the model
learns how to classify the data, and associate the unlabelled items with the correct
labels supplied in the training.
Reinforcement Learning
In Reinforcement Learning, the model is not trained with metadata associating inputs
with outputs. Instead, for every decision (or 'action' as it's called in RL) the model
(or 'agent' as it's called in RL) receives a 'reward' based on the success of the action. The
algorithm tries to learn what the optimal decisions are by maximising its long-term
reward. The reward received for taking an action might not be an immediate reward
for the last action, but could be a delayed reward for an action or series of actions
taken much earlier on. Part of reinforcement learning is to solve this attribution
problem of delayed rewards. This learning process also involves a balance
between exploration (of new actions which haven't yet been made) vs exploitation (of
actions which are known to reward higher than others). RL is learning by trial and
error.
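The delayed-reward and exploration/exploitation ideas above can be sketched with tabular Q-learning on a toy problem. This is an illustrative assumption rather than any system from the cited work: the five-state corridor, the reward of 1 at the far end, and the parameter values (`alpha`, `gamma`, `epsilon`) are all made up for the example.

```python
import random


def train_corridor_agent(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                         epsilon=0.2, seed=0):
    """Learn action values for a corridor where only the last state pays a reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # exploration: try a random action
            else:
                best = max(q[state])  # exploitation: pick a best-known action
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The discounted value of the next state carries the delayed
            # reward back to the earlier actions that led towards it.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_corridor_agent()
greedy_policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(greedy_policy)
```

Only the final step is ever rewarded directly; the earlier "right" moves acquire value purely through the discounted update, which is one simple answer to the attribution problem.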
Deep Learning and motivations

Deep Learning (DL) is a technique that aims to minimize or eliminate domain-specific
feature engineering [Guo et al. 2014].
Traditional machine learning techniques such as support vector machines or shallow
neural networks are unable to work well with raw, highly complex, high-dimensional data.
When working with such models, it is necessary to pre-process the data, reduce
dimensions and extract hand-crafted domain-specific features, a process called feature
engineering [LeCun 2012]. These features are called a representation (of the data).
The machine learning model is then trained on this representation. The process of
feature engineering is often quite difficult and time-consuming, and requires skill [Ng
2013]. Furthermore, the performance of the model is highly dependent on the
representation [Bengio et al. 2013]. As a result, feature engineering approaches can
provide inconsistent, unreliable results.
Using deep learning techniques, the pre-processing and feature extraction steps can be
skipped. Instead, the model is fed the higher-dimensional, raw data. The deep learning
model is a stack of parameterised, non-linear feature transformations that can be
used to learn hierarchical representations [LeCun 2014]. During training, each
layer learns which transformation to apply, i.e. it learns which feature to store. As a
result, the deep learning model stores a hierarchy of features with an increasing level
of abstraction.
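A "stack of parameterised, non-linear feature transformations" can be sketched as repeated matrix-multiply-then-squash steps. The sketch below is only the forward pass, with random untrained weights and arbitrarily chosen layer sizes (8, 6, 4, 2); in a real model the weights would be learned from data, and the `tanh` non-linearity is just one common choice assumed here.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One parameterised transformation: a weight matrix and a bias vector."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def apply_layer(layer, inputs):
    """Weighted sum followed by a non-linearity (tanh)."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(layers, inputs):
    """Feed raw input through the stack, keeping every intermediate representation."""
    activations = [inputs]
    for layer in layers:
        activations.append(apply_layer(layer, activations[-1]))
    return activations


rng = random.Random(0)
layer_sizes = [8, 6, 4, 2]  # raw input size, then progressively smaller layers
layers = [make_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
raw_input = [rng.random() for _ in range(layer_sizes[0])]

activations = forward(layers, raw_input)
print([len(a) for a in activations])
```

Each entry of `activations` is the representation held at one level of the stack; training would adjust `weights` and `biases` so that deeper levels hold more abstract features.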
This makes deep learning very powerful in handling real-world data. However, with
many more parameters to learn, often in the millions, this comes at the cost of more
complex implementations and requirements for higher computational power and
larger training sets [LeCun et al. 1998].
In summary: A machine learning model requires a representation of the world.
In shallow learning, this representation needs to be hand-crafted using domain-specific
pre-processing and feature extraction. In deep learning, the raw data can be
directly fed into the deep model and the algorithms learn a hierarchy of
representations.
However, even though domain-specific feature engineering is not necessary for deep
learning, representation of the input data is still important. A key component in
implementing a successful deep learning system is the relationship between
the representation and the architecture of the model: they need to be compatible so that
the relevant features can be extracted efficiently from the inputs [Bengio et al. 2013].
In the seminal research [Krizhevsky et al. 2012], the authors found that removing any
one of their convolutional layers, each of which contained no more than 1% of their
60 million parameters, resulted in inferior performance.
AI researchers are very aware of this limitation: the domain-specific
dependency on the architecture of the model puts current deep learning models
far from the 'silver bullet / universal learning algorithm' that media
hype sometimes proposes deep learning to be.
Very brief history of Deep Learning

An in-depth survey of deep learning and related algorithms is out of scope of this text
and I'll only focus on recent major milestones relevant to this research. For an
in-depth survey please see [Schmidhuber 2014] (Update: Schmidhuber's survey goes
through a lot of detail and history of the development of the algorithms and
breakthroughs in machine / deep learning. Kyle McDonald recently pointed me
to this article by Schmidhuber which briefly summarizes some of these details).
In [LeCun et al. 1989] Yann LeCun used a Convolutional Neural Network (CNN), a
deep learning neural network with many layers and connections inspired by those
found in biological systems, to recognise handwritten digits. However, the computers
of the era were not powerful enough to run the operations required by the network, so
an additional DSP board was needed. Over the next twenty years LeCun's research
showed that convolutional networks had the ability to learn and recognise patterns in
image and speech recognition [LeCun and Bengio 1995] [LeCun et al. 1995] [LeCun et
al. 1998] [LeCun et al. 2004].
However, deep learning algorithms require a great deal of computing power,
especially when dealing with large inputs such as images, in which case the amount of
computation scales linearly with the number of pixels [Mnih et al. 2014]. CNNs were
not practical for real-world applications until highly parallel Graphics Processing
Units (GPUs) became available [Raina et al. 2009]. This led to an explosion of deep
learning implementations. Quite famously, Quoc Le et al developed software that
learned how to detect cats and extract their features by sampling random frames from
10 million YouTube videos [Le et al. 2011].
With so many parameters to train, convolutional networks require massive datasets.
In the absence of such data, convolutional networks were being outperformed by
older, hand-crafted, specialist pattern recognition algorithms. With the introduction of
large datasets with millions of labeled images in thousands of categories, such as
ImageNet [Deng et al. 2009], this balance shifted. In 2012 Geoffrey Hinton's students
Alex Krizhevsky and Ilya Sutskever designed a new deep CNN architecture with 60
million parameters and trained it across two GPUs [Krizhevsky et al. 2012]. Their
model outperformed traditional image recognition methods by a large margin. This
was a turning point that led to many improvements over recent years, dramatically
decreasing errors in predictions.
A similar pattern can be seen in the adoption of Recurrent Neural Networks (RNNs),
deep neural networks with feedback loop connections, able to store internal states,
allowing them to process sequential, temporal data. In the field of speech recognition,
hand-crafted Gaussian Mixture Model / Hidden Markov Model (GMM-HMM) approaches were
consistently outperforming other methods, including deep learning. Once large
datasets and computing power became available, recurrent neural networks started to
outperform GMM-HMM and become more widespread in speech recognition [Hinton
et al. 2012] [Deng et al. 2013].
As described above, a significant aspect of the successful application of Deep Learning
involves the recognition of complex patterns.
Recent research has demonstrated that deep learning approaches are useful beyond
classification and recognition tasks. In [Sutskever and Vinyals 2014] the authors
used deep learning to translate text. This involved the considerable challenge of
dealing with input and output sequences of variable length. The authors used Long
Short-Term Memory (LSTM) [Hochreiter and Schmidhuber 1997] Recurrent Neural
Networks, and their approach is known to outperform previously known methods for
solving such problems. In [Graves 2013], Alex Graves demonstrated using LSTM
networks to generate a variety of sequential outputs including long chunks of text and
handwriting.
Further research in the field has shown promise with respect to uses of deep learning
with artistic, creative output.
Deep Learning for Artistic & Creative Output

(NB: this section is already very out of date and could be 10x longer!)
In [Erhan et al. 2009], the authors were curious about methods of qualitative analysis
of network architectures, particularly the effects of varying inputs on specific neuron
activity in hidden layers of deep models. They achieved this by using gradient ascent
on image classification deep models. With this method they were able to generate
inputs which maximised neuron activity on the layers of interest in Stacked Denoising
Autoencoders and Deep Belief Networks.
In 2013 researchers at Oxford University [Simonyan et al. 2013] used a similar
gradient ascent method to generate images that maximise a class score in a
convolutional network trained on ImageNet. They used a similar technique to also
generate a class saliency map, given an input image and a class.
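The gradient ascent idea used in these papers (keep the trained weights fixed and adjust the input so that a chosen score goes up) can be sketched on a toy model. Everything below is an assumption made for illustration: the two-unit `tanh` network, its hand-written weights and the three-value "image" stand in for a trained convolutional network and are not taken from the cited work.

```python
import math

# Hypothetical fixed ("already trained") weights of a tiny network.
HIDDEN_WEIGHTS = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.5]]
CLASS_WEIGHTS = [1.0, -1.0]


def hidden(x):
    """Hidden activations for input x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in HIDDEN_WEIGHTS]


def class_score(x):
    """Score of the class of interest: a weighted sum of hidden activations."""
    return sum(u * h for u, h in zip(CLASS_WEIGHTS, hidden(x)))


def score_gradient(x):
    """Analytic gradient of class_score with respect to the input values."""
    h = hidden(x)
    return [
        sum(u * (1 - hj * hj) * row[i]
            for u, hj, row in zip(CLASS_WEIGHTS, h, HIDDEN_WEIGHTS))
        for i in range(len(x))
    ]


def ascend(x, steps=100, rate=0.1):
    """Repeatedly nudge the input in the direction that raises the class score."""
    for _ in range(steps):
        grad = score_gradient(x)
        x = [xi + rate * g for xi, g in zip(x, grad)]
    return x


start = [0.0, 0.0, 0.0]
generated = ascend(start)
print(class_score(start), class_score(generated))
```

Note that the update changes `x`, not the weights: this is the reverse of training, where the input is fixed and the weights move.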
The following year, Oxford University researchers [Mahendran and Vedaldi 2014]
were frustrated at the lack of knowledge of the internals of image representation used
in convolutional networks trained for image classification. They applied gradient
ascent to invert a CNN, reconstructing images for the various different hidden layers
and neurons. They found that a CNN trained on image datasets such as ImageNet stored
photographic information on some layers, and more abstract features such as
edges, shapes and gradients on others. The visual outputs of these methods include abstract
but recognisable representations of the trained images and classes.
In 2015, Alexey Dosovitskiy and Thomas Brox devised a new method of inverting
CNNs to visualise the internal representations of a convolutional neural network by
using a second convolutional network [Dosovitskiy and Brox 2015].
Also in 2015, Anh Nguyen et al were curious how recognition in deep image
classification models differed from visual recognition in humans. They used
convolutional networks trained on ImageNet and the MNIST handwriting dataset
[LeCun et al.], combined with evolutionary algorithms and gradient ascent to generate
images that scored highly for specific classes, but that were unrecognisable to humans
[Nguyen et al. 2015]. They found that in some cases they could generate images that
the CNN would assign to particular classes with 99.99% confidence, but that were completely
unrecognisable to humans (e.g. detecting a cheetah in white noise, a starfish in wavy
lines etc.). Interestingly, they submitted the output images to an art contest, and were
among the 21.3% of the submitted artworks selected for exhibition.
Again in 2015 Google researchers [Mordvintsev et al. 2015] released code for
research they called #DeepDream #Inceptionism, which went viral on social media.
They used similar gradient ascent to generate images that maximised activity on
particular hidden neurons, but then fed the generated output back into the input to
create feedback loops that acted to amplify activity. Combined with image
transformations on every iteration, this created endless fractal-like animations and
'hallucinations' of abstract but subtly recognisable imagery. They used their own
GoogLeNet convolutional network architecture for this research, details of which can
be found in [Szegedy et al. 2014].
Also in 2015 [Gatys et al. 2015] released similar research that was also highly shared
on social networks, called #StyleNet. This research extracts the artistic style of an
image, for example a painting by Van Gogh or Edvard Munch, and applies it to
another image, such as a photograph. Techniques of applying artistic styles to images
have been researched for many years as a subset of non-photorealistic rendering; a
2013 survey can be found in [Kyprianidis et al. 2013]. However, in most cases the
algorithms are hand-crafted to resemble each particular style (an exception to this can
be seen in [Mital et al. 2013]). In Gatys et al.'s research, the authors found that they
could use convolutional neural networks to separate the content and the style of the
image, storing different representations for each. Doing so enabled them to apply
different transformations, or even mix and match representations from different
images, e.g. applying the style of Van Gogh's Starry Night to a photograph. This
technique works remarkably well even on very abstract styles such as Mark Rothko,
Jackson Pollock, or Piet Mondrian.
Similar developments have also taken place with sequenced data using recurrent
neural networks. In 2015 Andrej Karpathy released an open-source Recurrent Neural
Network implementation for training on character-level language models called
char-rnn [Karpathy 2015a] based on [Graves 2013]. The software takes a single text file, and
generates similar text based on character sequence probabilities. This software has
been used by a number of people to generate text in the style of Shakespeare, cooking
recipes, rap lyrics, Obama speeches, the bible and more [Karpathy 2015b]. This
char-rnn library was also used by [Sturm 2015] to generate midi notes in the style of folk
music. Due to it being a simple implementation using a text sequence, it is limited to
monophonic output. Other examples of composing music using recurrent networks
include [Eck and Schmidhuber 2002] [Boulanger-Lewandowski et al. 2012].
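To give a feel for "generating similar text based on character sequence probabilities", here is a deliberately simpler stand-in: a fixed-order character Markov chain. It is not char-rnn and contains no neural network (a recurrent network learns a far richer, variable-length context); the corpus string, the context length of 3 and the function names are assumptions for illustration only.

```python
import random
from collections import Counter, defaultdict


def build_char_model(text, order):
    """Count which character follows each `order`-character context in the text."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        model[text[i:i + order]][text[i + order]] += 1
    return model


def generate(model, seed_text, order, length, rng):
    """Extend seed_text one character at a time, sampling by observed frequency."""
    out = seed_text
    for _ in range(length):
        counts = model.get(out[-order:])
        if not counts:  # unseen context: stop rather than guess
            break
        chars, weights = zip(*counts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out


corpus = "to be or not to be that is the question "
model = build_char_model(corpus, order=3)
sample = generate(model, "to ", order=3, length=40, rng=random.Random(1))
print(sample)
```

Because the output is one symbol after another, the same scheme applied to note names yields a single melodic line, which mirrors the monophonic limitation mentioned above.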
In addition, in 2013, DeepMind Technologies (recently acquired by Google) developed
a system which is capable of learning how to play Atari games simply by observing the
images on the screen [Mnih et al. 2013]. Given no prior knowledge of the game rules or
controls, with only the screen pixels as input, the system developed strategies within a
few days of playing. In some cases these strategies outperformed other AIs, and in
other cases even outperformed human players. They achieved this by implementing a
deep reinforcement learning algorithm training convolutional networks with
Q-learning, an approach they call Deep Q Networks (DQN). In other recent research
[Guo et al. 2014], the authors investigated methods of improving on DQN's
performance using Monte Carlo Tree Search methods [Browne and Powley 2012]. Not
having access to the internal game state, the researchers trained a CNN with
offline UCT using the screen pixels as input, and then used the trained CNN at runtime
as a policy to select actions. They found that the UCT-trained CNN is currently the
state-of-the-art real-time AI and beat the score of the DQN. However, achieving those
results required a very long offline training period.
These recent developments using deep learning networks to generate images, sounds
or actions show incredible potential for the application of deep learning for creative
output. There are still many unexplored territories, however, and the performance is
far from real time and not interactive.
Brief History of Algorithmic Computational Art (in relation to ML/AI)
In this section I take a slight digression to acknowledge algorithmic art prior to Deep
Learning. An in-depth survey is out of scope for this review, but this serves as a brief
summary of the area. A more comprehensive review can be found in [Grierson 2005]
and [Levin 2000].
The use of computers for the purposes of making art dates back at least as far as the
1950s and 1960s, most notably with John Whitney's DIY analog computers built from
World War II M5 and M7 targeting computers and anti-aircraft machinery [Alves
2005]. Whitney's works were not only pioneering technically, leading to the birth of
computer graphics and special effects in scenes such as Stanley Kubrick's 2001: A
Space Odyssey, but in addition they pioneered the field of computer-aided audiovisual
composition [Grierson 2005]. His work continues in the tradition of experimental
abstract animators and filmmakers such as Norman McLaren and Oskar Fischinger. He
was joined shortly after by software artists such as Paul Brown, Vera Molnar, Manfred
Mohr, Frieder Nake, Larry Cuba and many more.
However, it was Harold Cohen's AARON software from 1973 [Cohen 1973] which first
introduced artificial intelligence into computer art. AARON was ostensibly a piece of
software written to understand colour and form. Cohen often talks about 'training' his
software [Cohen 1994]. However, he uses the term rhetorically. The 'learning' in
AARON is not the machine learning mentioned in previous sections. AARON does not
learn by looking at data. Instead, whenever Cohen wants AARON to learn something
new, he has to analyse it himself, and implement the sets of rules required to replicate
that behaviour. Often these are very complex sets of rules that take years for Cohen
himself to learn before he can program them [Cohen 2006].
Other computer graphics artists working with artificial intelligence at the time include
William Latham [Todd and Latham 1992], Karl Sims [Sims 1994], and Scott Draves
[Draves 2005]. These artists primarily explored evolutionary algorithms in the
creation of algorithmic art, also known as evolutionary art. Also during this period,
David Cope developed an algorithm for composing music. His Experiments in Musical
Intelligence (EMI) began in 1981, and he developed it over the decades until he
eventually patented the algorithm 'Recombinant music composition algorithm' [Cope
2010]. Using this algorithm he generated musical sequences in the style of many
classical composers and styles, such as Bach, Vivaldi, Beethoven, Mozart, Chopin and
Debussy. His latest software Emily Howell has albums being released under its name.
Starting in the 1960s Myron Krueger developed gesturally interactive computer
artworks and Responsive Environments, culminating in his seminal Artificial Reality
environment Videoplace [Krueger et al. 1985]. Videoplace tracked users with
cameras, enabling them to interact with virtual objects in the scene using projectors.
First developed in 1986, David Rokeby's Very Nervous System explores similar
themes of gestural full-body interaction using hand-built cameras, in this case to
generate music [Rokeby 1986]. Other notable artists working with similar ideas in this
era include Ed Tannenbaum, Scott Snibbe, Michael Naimark, Golan Levin and Camille
Utterback.
With the introduction of creative coding tools and open-source communities there has
been exponential growth in this field over the last decades. Some of these tools which
have global communities include Processing, openFrameworks, Cinder, vvvv,
Max/MSP/Jitter, PureData, SuperCollider, TouchDesigner, QuartzComposer, Three.js
and many smaller bespoke ones.
This process-driven creative art form can be traced back to a wider rule-based
generative art movement that includes composers such as Steve Reich, John Cage,
Terry Riley, Brian Eno and artists such as Sol LeWitt and Nam June Paik.
Computational Creativity
A subfield of Artificial Intelligence which is related to this area is Computational
Creativity. Whereas artificial intelligence questions whether a machine can think, or
exhibit intelligent behaviour [Turing 1950], computational creativity questions
whether a machine can be creative, or exhibit creative behaviour.
Computational Creativity research is not only concerned with the creative output of
the algorithms or technical implementation details, but is equally if not more
concerned with the philosophical, cognitive, psychological and semantic connotations
of machines exhibiting creative behaviour, or acting creative. In [McCormack and
d'Inverno 2012] and related papers [McCormack and d'Inverno 2014], McCormack
and d'Inverno ask and attempt to answer questions regarding computers and
creativity, creative agency and the role of creative tools.
As part of the philosophical angle of computational creativity research, there is often
an emphasis on fully autonomous systems. This field includes research into software
which exhibits intentionality and is able to justify the decisions it makes when creating
a piece of work by framing information in the context of the work [Colton and Wiggins
2012]. This can be thought of as analogous to an artist making deliberate, purposeful
decisions at every step of the creative process. This field of computational creativity is
sometimes referred to as strong computational creativity [al-Rifaie and Bishop 2015],
analogous to John Searle's strong (vs weak) artificial intelligence [Searle 1980]. This
field is also accompanied by formalisms, proposed models and theories of creativity to
ensure the system's behaviour complies with what is thought to be creative behaviour
[Colton et al. 2011].
Within this context there has been research into systems that conceive fictional
concepts [Cavallo et al. 2013], design video games [Cook et al. 2014], write poetry
[Colton et al. 2012], and perform other inherently creative tasks.
NB: While the algorithmic techniques used for content generation in Computational
Creativity are within the scope of my research, the formalisms and models of creativity
are not. It can be said that my research is interested in weak computational
creativity, particularly semi-autonomous, collaborative creativity, where
human interaction in the content creation process is not only relevant, it is
essential: the aim is to create interactive systems where the human user can guide
the computationally creative system in real time.
Machine Learning for Artistic, Expressive Human Computer Interaction (AEHCI)
Introduction
In the previous sections I reviewed a non-exhaustive range of relevant literature in
areas of algorithmic image and sound generation, ranging from simple algorithms to
the latest developments in deep learning. Some of these have been non-realtime, for
example where content is generated through the application of deep learning, while
others have been realtime, even interactive. This section will cover Artistic Expressive
Human Computer Interaction (AEHCI): Human Computer Interaction for artistic
expression.
In [Dourish 2001] Paul Dourish proposes new models for interactive system
design. Embodied Interaction is interaction embodied in the environment, not just
physically, but as a fundamental component of the setting. It is an interaction design
that takes into consideration the ways we experience the everyday world. This
philosophy is particularly applicable when designing gestural interfaces for artistic
expression.
As mentioned before, Myron Krueger was also interested in exploring Responsive
Environments in which interaction is a central, not peripheral issue [Krueger et al.
1985]. He saw potential in this area for the arts, education, telecommunications as well
as general human-machine interaction and was motivated by creating playful
environments which explore the perceptual process we use to navigate the physical
world.
Human-computer interaction for music, or Musician-Computer Interaction (MCI)
[Gillian 2011], is one of the more academically established fields related to AEHCI,
more so than gestural human-computer interaction for visual composition. However,
many of the requirements for interaction design and particularly gesture recognition
are similar. Both require low-latency, realtime systems that can be configured on the
fly. They need to be capable of detecting a wide range of gestures: some AEHCI
systems might concentrate on subtle finger movements, while others track whole
bodies of multiple people. Furthermore, the ability to detect and respond to subtle
variations in gestures is essential to convey expressivity [Caramiaux et al. 2014]. Also,
in performance situations, gesture recognition need not be generalized across different
people; training can be specific to the performing individual to maximise personal
expression [Gillian 2011].
Due to these similarities, in this research Musician-Computer Interaction is taken as a
base model for expressive gestural interaction, and will be built on for general AEHCI.
Gestures
A survey of definitions of gesture, especially in relation to music, can be found in
[Cadoz and Wanderley, 2000]. The authors conclude that the many proposed
definitions do not adapt to gesture in music, but they purposefully avoid providing a
new definition, focusing instead on which aspects of the various definitions might
apply. In [Camurri et al., 2004] the authors define Expressive Gesture as "responsible
of [sic] the communication of information that we call expressive content" where
"Expressive content concerns aspects related to feelings, moods, affect, intensity of
emotional experience". This is the definition of Expressive Gesture that is used in this
research, complemented with the natural, spontaneous gestures made when a person
is telling a story, as described in [Cassell and McNeill, 1991], particularly those with
the semiotic classification of metaphoric, indicating abstract ideas [McNeill and Levy,
1980]. A wider study of gesture expressivity and its dimensions, especially in the
context of musical performance and human-computer interaction, can be found in
[Caramiaux, 2015].
This research is not concerned with detecting emotion within gesture as in [Cowie et
al., 2001] or [Zeng et al., 2009]. Instead it is concerned with finding correlations
between various parameters of a gesture, and parameters of the generative output
model. It will map expressive gesture to trained models of artistic content synthesis
and manipulation. Inspired by research in embodied cognition and the relationship
between action and perception [Kohler et al., 2002, Metzinger and Gallese, 2003,
Leman, 2007], in [Caramiaux et al., 2009] the authors investigate similar
relationships by analysing motion capture data of participants performing free-hand
movements while listening to short sound samples.
Gesture Recognition Sensors
One of the significant challenges in executing gestural interaction is reading relevant
information from the user: recognising their positions, movements and gestures. A
further challenge is extrapolating their motivations and intentions from those
gestures.
There are many hardware devices and sensors which can support this process:
accelerometers and inertial measurement units (IMU), myoelectric sensors, ultrasonic
and infrared range finders, 2D cameras / depth cameras / computer vision (CV),
radar, lidar etc. Surveys of gesture recognition technology and research can be found
in [Gillian 2011] and [LaViola Jr. 2013].
My research does not investigate new modes of sensing. It focuses on emerging
consumer technology, and applications of algorithmic image and sound synthesis and
manipulation within that context. This is to remain applicable to hardware and
environments that relate to mainstream use. Primarily this will involve depth cameras
similar to Microsoft's Kinect 2 and Leap Motion, as well as computer vision with
traditional 2D cameras. However, there is an increasing trend in consumer devices to
combine multiple sensors for more varied data, contributing to higher accuracy in
estimations of pose and movement. Past examples of this are Nintendo's Wiimote
controller, combining an accelerometer with an infrared sensor (and an additional
gyroscope with the MotionPlus add-on), and Sony's PS Move controller, combining a
6-axis IMU with a magnetometer and high-speed computer vision. Both controllers also
feature buttons and a D-pad as well as vibration-based haptic feedback. Microsoft's
Kinect also combines an RGB camera, IR camera / depth sensor, microphone array,
and accelerometer (to detect device orientation) into a single, affordable consumer
device.
Next-generation devices are combining increasing numbers of sensors. Project Tango
by Google's Advanced Technology And Projects Group (ATAP) [Google Advanced
Technology And Projects 2014] is an example of this trend. Project Tango is a
next-generation mobile device with a depth sensor, motion tracking camera and 9-axis IMU
enabling it to calculate its position and orientation in space, while simultaneously
scanning and building a 3D map of its environment. A different team at the same
group has very recently announced a research project called Project Soli [Google
Advanced Technology And Projects 2015]. Project Soli uses radar to track hand and
finger movements at sub-millimetric precision and high speed, enabling natural,
intuitive interfaces for small wearable devices.
Such devices share a common problem: extracting meaningful information from the data for gesture recognition is a challenging task. Currently, machine learning is a very popular and successful research area in this field.
Gesture Recognition for AEHCI

Gesture recognition is a very broad field. This section will focus on gesture recognition for AEHCI. Wider surveys can be found in [Mitra and Acharya 2007], [Gillian 2011] and [LaViola Jr. 2013].
In 1992 Michael Lee et al. used a neural network inside the MAX/MSP musical programming environment to investigate adaptive user interfaces for real-time musical performance. They were able to successfully recognise gestures from a number of devices, including a radio baton and a continuous space controller [Lee et al. 1992]. In 1993, Sidney Fels and Geoffrey Hinton used neural networks to map hand movements, captured via a data glove, to a speech synthesiser [Fels and Hinton 1993]. They achieved real-time results with a vocabulary of 203 gesture-to-word mappings, demonstrating the potential of neural networks for adaptive interfaces.
Artificial Neural Networks are particularly useful for AEHCI as they are able to map m-dimensional input vectors to n-dimensional output vectors with a learned nonlinear function, allowing them to control complex parameter sets simultaneously. This is especially useful in regression tasks, when manipulating continuous parameters of a generative visual or audio model. Many other machine learning techniques have been used for gesture recognition, with different specific use cases. These include K-Nearest Neighbour, Gaussian Mixture Models, Random Forests, Adaptive Naive Bayes Classifiers and Support Vector Machines to classify static data; Dynamic Time Warping and Hidden Markov Models to classify temporal gestures; and Linear Regression, Logistic Regression and Multivariate Linear Regression for regression as opposed to classifying the input. A survey of machine learning techniques and applications for musical gesture recognition can be found in [Caramiaux and Tanaka 2013].
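
To make the temporal case concrete, the following is a minimal sketch of Dynamic Time Warping used as a 1-nearest-neighbour gesture classifier. It assumes one-dimensional gestures and a single recorded template per class; the function names, labels and sample data are illustrative and are not taken from any of the cited systems.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: repeat b[j-1], repeat a[i-1], or advance both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template closest to `sample` under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical templates: a rising and a falling gesture.
templates = {"rise": [0, 1, 2, 3, 4], "fall": [4, 3, 2, 1, 0]}
# The same rising gesture performed at half speed.
slow_rise = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]

distance_to_rise = dtw_distance(slow_rise, templates["rise"])
label = classify(slow_rise, templates)
```

Because the warping path may repeat samples, the half-speed performance aligns with the "rise" template at zero cost, which is the property that makes the technique suited to gestures performed at varying speeds.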
As mentioned previously, in an artistic, performative context, detecting subtle variations of gestures is vital to conveying expressivity. In [Bevilacqua and Muller 2005] [Bevilacqua et al. 2009], Bevilacqua et al. design continuous gesture followers that allow temporal gesture recognition in real time while the gesture is being performed. This algorithm returns time progression information and likelihood, enabling performers to alter the speed and accuracy of the gesture to control parameters of their generative model.
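
The idea of following a gesture while it is being performed can be sketched as below. This is a heavily simplified illustration, not the cited authors' implementation: it assumes a one-dimensional template, treats every template sample as one state of a left-to-right hidden Markov model, and updates the state probabilities with one forward step per incoming sample. The class name, the Gaussian width `sigma` and the transition probabilities are all assumptions chosen for the example.

```python
import math


class GestureFollower:
    """Simplified template follower: one HMM state per template sample."""

    def __init__(self, template, sigma=0.5, p_stay=0.25, p_next=0.5, p_skip=0.25):
        self.template = list(template)
        self.sigma = sigma
        # probabilities of advancing by 0, 1 or 2 template samples per input sample
        self.trans = (p_stay, p_next, p_skip)
        self.alpha = None  # current probability of being at each template index

    def _obs(self, x, mean):
        # unnormalised Gaussian similarity between input and template sample
        z = (x - mean) / self.sigma
        return math.exp(-0.5 * z * z)

    def step(self, x):
        """Consume one input sample; return (progress in [0, 1], likelihood)."""
        n = len(self.template)
        if self.alpha is None:
            prior = [1.0] + [0.0] * (n - 1)  # assume the gesture starts at the beginning
        else:
            prior = [0.0] * n
            for i, a in enumerate(self.alpha):
                for jump, p in enumerate(self.trans):
                    prior[min(i + jump, n - 1)] += a * p
        post = [p * self._obs(x, m) for p, m in zip(prior, self.template)]
        likelihood = sum(post)
        # if the sample is impossible under the model, fall back to the prior
        self.alpha = [p / likelihood for p in post] if likelihood > 0.0 else prior
        best = max(range(n), key=self.alpha.__getitem__)
        progress = best / (n - 1) if n > 1 else 1.0
        return progress, likelihood


# Follow a performance that matches the template exactly.
template = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
follower = GestureFollower(template)
progress_trace, likelihood_trace = [], []
for sample in template:
    progress, likelihood = follower.step(sample)
    progress_trace.append(progress)
    likelihood_trace.append(likelihood)
```

In a sketch like this, `progress` is the time progression value a performer could map to playback position, while `likelihood` falls as the performance deviates from the template and could be mapped to a second parameter.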
In [Caramiaux et al. 2014] [Caramiaux 2015] Caramiaux et al. develop systems that go beyond classification of the gestures, to characterise the qualities of the gesture's execution. They use computational adaptive models for identifying temporal, geometric and dynamic variations on the trained gesture. Returning this information in real time to the performer as they are executing gestures enables the performer to map the variations to parameters such as time-stretching samples, modulations, and volume or custom synth parameters.
In [Kiefer 2014] Kiefer investigates the use of Echo State (Recurrent Neural) Networks (ESN) as mapping tools, to learn sequences of input gestures and nonlinearly map them to multi-parameter output sequences. The research concludes that ESNs demonstrate good potential in pattern classification, multi-parametric control, and explorative and nonlinear mapping, but there is room for improvement to produce more accurate results in some cases.
Interactive Machine Learning (IML)

As discussed above, machine learning is a very successful technique for pattern and gesture recognition. However, using machine learning can be difficult because of the technical knowledge and time required in building classifiers and setting up the signal processing pipeline [Fails et al. 2003].
Interactive Machine Learning (IML) is a field which looks at the process of using machine learning through the lens of human-computer interaction research [Fiebrink 2011].
While ML brings huge advancements to the fields of data analysis and pattern recognition, IML seeks to improve how ML systems can be used, particularly by expanding the user base from dedicated computer scientists and closely related disciplines to a much wider audience. This is made possible via a Graphical User Interface (GUI) front end to a machine learning back end, with data streamed live to and from the back end. Training and prediction can be performed in real time without writing any code, making IML well suited to performance and AEHCI.
Rebecca Fiebrink et al.'s Wekinator software, released in 2009, is an IML system aimed at musical performance [Fiebrink et al. 2009]. Using a GUI, users are able to set up, train and modify parameters of a machine learning algorithm. The software also allows other applications, such as existing music software, visual software, or other custom generative software, to stream data to the Wekinator using Open Sound Control (OSC) [Wright and Freed 1997], a protocol, typically carried over UDP, that is commonly used for inter-app and inter-device communication. As the Wekinator receives this data, it runs it through a machine learning model and streams back predictions in real time. Wekinator also has a number of built-in sensor input and feature extraction capabilities, such as edge detection from a webcam. Using this tool, artists, musicians, dancers, performers and researchers from other fields can train and map gestures to arbitrary outputs, such as notes, effects, images and sounds, with no programming or need for any other computer vision software.
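
As an illustration of what streaming data over OSC involves for a host application, the sketch below encodes a list of floats as a single OSC 1.0 message and sends it as one UDP datagram. The helper names are hypothetical. The address `/wek/inputs`, the port 6448 and the localhost target are assumptions made for the example and should be checked against the configuration of the receiving application.

```python
import socket
import struct


def osc_string(s):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = s.encode("ascii") + b"\x00"
    return data + b"\x00" * ((-len(data)) % 4)


def osc_message(address, floats):
    """Build an OSC message carrying only float32 arguments."""
    type_tags = "," + "f" * len(floats)
    # OSC numeric arguments are big-endian
    payload = b"".join(struct.pack(">f", value) for value in floats)
    return osc_string(address) + osc_string(type_tags) + payload


def send_features(features, host="127.0.0.1", port=6448, address="/wek/inputs"):
    """Send one feature vector as a single UDP datagram (host, port, address assumed)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, features), (host, port))


# Encode, but do not send, a two-value feature vector.
packet = osc_message("/wek/inputs", [0.5, 1.0])
```

A host application would call `send_features` once per sensor frame and listen on a second UDP port for the predictions that are streamed back.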
Nick Gillian's Gesture Recognition Toolkit (GRT) from 2011 [Gillian 2011] provides similar functionality but with more emphasis on the signal processing / gesture recognition pipeline. It lacks built-in input functionality such as webcam or microphone inputs, but has a number of built-in pre-processing, feature extraction and post-processing algorithms. Examples of these are Fast Fourier Transform, Principal Component Analysis, various filters, derivatives, dead zones and more. In addition to being an open source application, the underlying codebase is released as a C++ framework, allowing it to be integrated into bespoke applications.
Recently NVIDIA released a similar GUI-based IML application, the Deep Learning GPU Training System (DIGITS), allowing researchers to use deep learning in a similar interactive fashion [NVIDIA 2015]. The software uses the popular open source deep learning framework Caffe [Jia et al. 2014], and is designed to take full advantage of GPU acceleration, scaling up automatically in multi-GPU systems.
In 2015, during my research I required a multi-model IML system: one in which I could dynamically create and train new models while leaving existing models intact. I also needed to be able to access multiple models simultaneously, feeding each model different inputs and receiving the associated predictions. For this I developed msaOscML [Akten 2015a]. It is inspired by and similar to Rebecca Fiebrink's Wekinator [Fiebrink et al. 2009] and Nick Gillian's GRT [Gillian 2011]. However, while those tools are aimed at a non-technical audience and thus have a user-friendly Graphical User Interface (GUI), msaOscML is currently aimed at developers and thus has no GUI. It runs in the background as a server, with only a console to indicate status and provide visual feedback to the user. It can be interacted with (input and output) via the Open Sound Control (OSC) protocol [Wright and Freed 1997]. The main purpose of msaOscML, and its difference from Wekinator and GRT, is that it can dynamically create, train and manage multiple independent models simultaneously. That is, a host application (e.g. custom software generating sound or visuals) can communicate with msaOscML via OSC: it can send messages to create and train any number of models, and receive predictions from multiple models simultaneously.
Cross-platform and written in C++, msaOscML uses a Machine Learning Implementation Abstraction Layer (MLIAL). This allows different machine learning libraries to plug in as a back end with a minimal MLIAL wrapper. Currently there are MLIAL wrappers for Gillian's GRT framework, and Steffen Nissen's Fast Artificial Neural Network Library (FANN) [Nissen 2003]. msaOscML was written for and used on an R&D interactive dance project called Pattern Recognition [Akten 2015b].
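
The two design ideas described here, an abstraction layer over interchangeable learning back ends and a manager holding several independent named models, can be sketched as follows. This is an illustrative Python outline only: msaOscML itself is written in C++, and none of the class or method names below are taken from its code or from its OSC message set. The nearest-neighbour back end is a stand-in chosen to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class ModelBackend(ABC):
    """Hypothetical abstraction layer: the minimal interface a wrapped library must offer."""

    @abstractmethod
    def train(self, inputs, targets):
        """Fit the model to paired input and target vectors."""

    @abstractmethod
    def predict(self, x):
        """Return the output vector for one input vector."""


class NearestNeighbourBackend(ModelBackend):
    """Stand-in back end: returns the target of the closest training input."""

    def __init__(self):
        self.examples = []

    def train(self, inputs, targets):
        self.examples = list(zip(inputs, targets))

    def predict(self, x):
        if not self.examples:
            raise RuntimeError("model has not been trained")

        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], x))

        return min(self.examples, key=squared_distance)[1]


class ModelManager:
    """Holds any number of independent, named models."""

    def __init__(self, backend_factory=NearestNeighbourBackend):
        self.backend_factory = backend_factory
        self.models = {}

    def create(self, name):
        if name in self.models:
            raise KeyError(f"model {name!r} already exists")
        self.models[name] = self.backend_factory()

    def train(self, name, inputs, targets):
        # training one model leaves every other model untouched
        self.models[name].train(inputs, targets)

    def predict(self, requests):
        """Map {model name: input vector} to {model name: prediction}."""
        return {name: self.models[name].predict(x) for name, x in requests.items()}


manager = ModelManager()
manager.create("hand")
manager.train("hand", [[0.0, 0.0], [1.0, 1.0]], [[0.0], [1.0]])
manager.create("torso")  # added later; the "hand" model is unaffected
manager.train("torso", [[0.0], [10.0]], [[100.0, 200.0], [300.0, 400.0]])
predictions = manager.predict({"hand": [0.9, 0.8], "torso": [2.0]})
```

In a server of this kind, each incoming OSC message would be translated into one of the `create`, `train` or `predict` calls, and swapping the learning library would only require another `ModelBackend` subclass.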
Tools like these enable both technical and non-technical users to quickly set up, train and test models for gesture recognition. Without writing any code, users can start streaming input data from their sensors and receive predictions in their application of choice, enabling them to gesturally create, manipulate and perform audio-visual content in real time. An example of Fiebrink's Wekinator in use can be seen in the band 000000Swan's audio-visual shows, which are gesturally driven using a Microsoft Kinect and a commercially available sensor bow [Schedel et al. 2011]. In addition, it has also been applied in contexts such as workshops with people with learning and physical disabilities [Katan et al. 2015].
Conclusions
The field of algorithmically generating images and sound is very rich and well established. Expressive interaction, particularly for music, is also well established, with new advanced techniques emerging as the field matures. Deep learning is going through an almost revolutionary revival, with many recent developments. The gaming, entertainment and media industries are converging as next-generation multi-modal interaction and virtual, augmented and mixed reality technologies become mainstream.
Within this context, there are still many unexplored territories with a lot of artistic potential, especially at the intersections of these trends. These include ways of generating content using deep learning, particularly in real time and interactively, as well as applications of expressive interaction to the generation of that content, particularly with next-generation consumer devices set in mixed reality environments.
Bibliography
AKTEN, M. 2015a. msaOscML. https://github.com/memo/msaOscML.
AKTEN, M. 2015b. Pattern Recognition Dance Performance. http://www.memo.tv/patternrecognitionwip/.
AKTEN, M., STEEL, B., MCNICHOLAS, R., ET AL. 2011. Sony PlayStation Video Store Mapping.
AL-RIFAIE, M.M. AND BISHOP, J.M. 2015. Weak and Strong Computational Creativity. In: Computational Creativity Research: Towards Creative Machines. 014.
ALVES, B. 2005. Digital Harmony of Sound and Light. Computer Music Journal 29, 4, 4554.
BENGIO, Y., COURVILLE, A., AND VINCENT, P. 2013. Representation Learning: A Review and New Perspectives. Tpami 1993, 130.
BEVILACQUA, F. AND MULLER, R. 2005. A gesture follower for performing arts. Proceedings of the International Gesture, 34.
BEVILACQUA, F., ZAMBORLIN, B., SYPNIEWSKI, A., SCHNELL, N., GUÉDY, F., AND RASAMIMANANA, N. 2009. Continuous realtime gesture following and recognition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 5934 LNAI, 7384.
BOULANGER-LEWANDOWSKI, N., VINCENT, P., AND BENGIO, Y. 2012. Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription. Proceedings of the 29th International Conference on Machine Learning (ICML 12) Cd, 11591166.
BROWNE, C. AND POWLEY, E. 2012. A survey of Monte Carlo tree search methods. Intelligence and AI 4, 1, 149.
CADOZ, C. AND WANDERLEY, M. 2000. Gesture-music. Trends in gestural control of music, 7194.
CAMURRI, A., MAZZARINO, B., RICCHETTI, M., TIMMERS, R., AND VOLPE, G. 2004. Multimodal analysis of expressive gesture in music and dance performances. In: Gesture-based Communication in Human-Computer Interaction, LNAI 2915. 2039.
CARAMIAUX, B. 2015. Motion Modeling for Expressive Interaction: A Design Proposal using Bayesian Adaptive Systems. International Workshop on Movement and Computing (MOCO), IRCAM.
CARAMIAUX, B., BEVILACQUA, F., AND SCHNELL, N. 2009. Towards a gesture-sound cross-modal analysis. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 5934 LNAI, 158170.
CARAMIAUX, B., DONNARUMMA, M., AND TANAKA, A. 2015. Understanding Gesture Expressivity through Muscle Sensing. ACM Transactions on Computer-Human Interaction 0, 0, 127.
Real-time image & sound synthesis & expressive manipulation using DL & RL in responsive environments. Memo Akten, IGGI, Literature Review 30/09/2015
CARAMIAUX, B., MONTECCHIO, N., TANAKA, A., AND BEVILACQUA, F. 2014. Adaptive Gesture Recognition with Variation Estimation for Interactive Systems. ACM Transactions on Interactive Intelligent Systems (TiiS) (In Press) V, 212.
CARAMIAUX, B. AND TANAKA, A. 2013. Machine Learning of Musical Gestures. Proceedings of the International Conference on New Interfaces for Musical Expression, 513518.
CASSELL, J. AND MCNEILL, D. 1991. Gesture and the Poetics of Prose. Poetics Today 12, 3, 375404.
CAVALLO, F., PEASE, A., GOW, J., AND COLTON, S. 2013. Using Theory Formation Techniques for the Invention of Fictional Concepts. 176183.
CIRESAN, D., MEIER, U., AND MASCI, J. 2011. Flexible, high performance convolutional neural networks for image classification. International Joint Conference on Artificial Intelligence, 12371242.
COHEN, H. 1973. Parallel to perception: some notes on the problem of machine-generated art. Computer Studies, 110.
COHEN, H. 1994. The Further Exploits of Aaron, Painter.
COHEN, H. 2006. AARON, Colorist: from Expert System to Expert.
COLLOBERT, R., WESTON, J., BOTTOU, L., KARLEN, M., KAVUKCUOGLU, K., AND KUKSA, P. 2011. Natural language processing (almost) from scratch. The Journal of Machine Learning Research 1, 12, 24932537.
COLTON, S., GOODWIN, J., AND VEALE, T. 2012. Full-FACE Poetry Generation. Proceedings of the Third International Conference on Computational Creativity (ICCC 12), 95102.
COLTON, S., PEASE, A., AND CHARNLEY, J. 2011. Computational creativity theory: The FACE and IDEA descriptive models. Proceedings of the Second International Conference on Computational Creativity, 9095.
COLTON, S. AND WIGGINS, G.A. 2012. Computational creativity: The final frontier? Frontiers in Artificial Intelligence and Applications 242, 2126.
COOK, M., COLTON, S., AND GOW, J. 2014. Automating Game Design In Three Dimensions. AISB Symposium on AI and Games, 36.
COPE, D.H. 2010. Recombinant music composition algorithm and method of using the same.
COUPRIE, C., NAJMAN, L., AND LECUN, Y. 2013. Learning Hierarchical Features for Scene Labeling. Pattern Analysis and Machine Intelligence, IEEE Transactions on 35, 8, 19151929.
COWIE, R., DOUGLAS-COWIE, E., TSAPATSOULIS, N., ET AL. 2001. Emotion recognition in human-computer interaction. Signal Processing Magazine, IEEE 18, 1, 3280.
CRUZ-NEIRA, C., SANDIN, D., AND DEFANTI, T. 1993. Surround-screen projection-based virtual reality: the design and implementation of the CAVE. of the 20Th Annual Conference on, 135142.
DENG, J., DONG, W., SOCHER, R., LI, L.-J., LI, K., AND FEI-FEI, L. 2009. ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, 29.
DENG, L., HINTON, G., AND KINGSBURY, B. 2013. New Types of Deep Neural Network Learning for Speech Recognition and Related Applications: an Overview. 85998603.
DOSOVITSKIY, A. AND BROX, T. 2015. Inverting Convolutional Networks with Convolutional Networks. 115.
DOURISH, P. 2001. Where the Action Is: The Foundations of Embodied Interaction. Where the action is the foundations of embodied interaction 36, 233. http://books.google.com/books?id=DCIy2zxrCqcC&pgis=1.
DRAVES, S. 2005. The Electric Sheep screen-saver: A case study in aesthetic evolution. Proc. EvoMUSART, 458467.
ECK, D. AND SCHMIDHUBER, J. 2002. A first look at music composition using LSTM recurrent neural networks. Istituto Dalle Molle Di Studi Sull'Intelligenza.
ERHAN, D., BENGIO, Y., COURVILLE, A., AND VINCENT, P. 2009. Visualizing higher-layer features of a deep network. Bernoulli 1341, 113.
FAILS, J.A., OLSEN, JR., D.R., AND OLSEN, D.R. 2003. Interactive Machine Learning. Proceedings of the 8th International Conference on Intelligent User Interfaces, ACM, 3945.
FELS, S.S. AND HINTON, G.E. 1993. Glove-talk: a neural network interface between a data-glove and a speech synthesizer. IEEE Transactions on Neural Networks 4, 1, 28.
FIEBRINK, R., TRUEMAN, D., AND COOK, P.R. 2009. A meta-instrument for interactive, on-the-fly machine learning. Proc. NIME 2, 3.
FIEBRINK, R.A. 2011. Real-time Human Interaction with Supervised Learning Algorithms for Music Composition and Performance. Imagine January, 376.
GATYS, L.A., ECKER, A.S., AND BETHGE, M. 2015. A Neural Algorithm of Artistic Style. 37.
GILLIAN, N.E. 2011. Gesture Recognition for Musician Computer Interaction. Social Sciences March.
GOOGLE ADVANCED TECHNOLOGY AND PROJECTS. 2014. Project Tango. https://www.google.com/atap/projecttango/.
GOOGLE ADVANCED TECHNOLOGY AND PROJECTS. 2015. Project Soli. https://www.google.com/atap/projectsoli/.
GRIERSON, M. 2005. Audiovisual composition. http://www.strangeloop.co.uk/Dr. M.GriersonAudiovisualCompositionThesis.pdf.
GUO, X., SINGH, S., LEE, H., LEWIS, R., AND WANG, X. 2014. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning. Advances in Neural Information Processing Systems (NIPS) 272600, 33383346.
GUZELLA, T.S. AND CAMINHAS, W.M. 2009. A review of machine learning approaches to Spam filtering. Expert Systems with Applications 36, 7, 1020610222.
HINTON, G., DENG, L., YU, D., ET AL. 2012. Deep Neural Networks for Acoustic Modeling in Speech Recognition. IEEE Signal Processing Magazine November, 8297.
HOCHREITER, S. AND SCHMIDHUBER, J. 1997. Long short-term memory. Neural computation 9, 8, 173580.
SUTSKEVER, I., VINYALS, O., AND LE, Q.V. 2014. Sequence to Sequence Learning with Neural Networks. NIPS, 19.
JIA, Y., SHELHAMER, E., DONAHUE, J., ET AL. 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. arXiv preprint arXiv:1408.5093.
JONES, B., SHAPIRA, L., SODHI, R., ET AL. 2014. RoomAlive: Magical Experiences Enabled by Scalable, Adaptive Projector-camera Units. Proceedings of the 27th annual ACM symposium on User interface software and technology UIST '14, 637644.
JONES, B.R., BENKO, H., OFEK, E., AND WILSON, A.D. 2013. IllumiRoom: peripheral projected illusions for interactive experiences. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems CHI '13, 869.
KARPATHY, A. 2015a. char-rnn. https://github.com/karpathy/charrnn.
KARPATHY, A. 2015b. The Unreasonable Effectiveness of Recurrent Neural Networks.
KARPATHY, A. AND FEI-FEI, L. 2015. Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR.
KATAN, S., GRIERSON, M., AND FIEBRINK, R. 2015. Using Interactive Machine Learning to Support Interface Development Through Workshops with Disabled People. CHI '15 Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems.
KIEFER, C. 2014. Musical Instrument Mapping Design with Echo State Networks. Proceedings of the International Conference on New Interfaces for Musical Expression, 293298.
KOHLER, E., KEYSERS, C., UMILTÀ, M.A., FOGASSI, L., GALLESE, V., AND RIZZOLATTI, G. 2002. Hearing sounds, understanding actions: action representation in mirror neurons. Science (New York, N.Y.) 297, 5582, 846848.
KRIZHEVSKY, A., SUTSKEVER, I., AND HINTON, G.E. 2012. ImageNet Classification with Deep Convolutional Neural Networks. Advances In Neural Information Processing Systems, 19.
KRUEGER, M.W., GIONFRIDDO, T., AND HINRICHSEN, K. 1985. VIDEOPLACE - an artificial reality. ACM SIGCHI Bulletin 16, 4, 3540.
KYPRIANIDIS, J.E., COLLOMOSSE, J., WANG, T., AND ISENBERG, T. 2013. State of the Art: A taxonomy of artistic stylization techniques for images and video. IEEE Transactions on Visualization and Computer Graphics 19, 5, 866885.
LAVIOLA JR., J.J. 2013. 3D Gestural Interaction: The State of the Field. ISRN Artificial Intelligence 20132013, 2, 118.
LE, Q.V., RANZATO, M.A., MONGA, R., ET AL. 2011. Building high-level features using large scale unsupervised learning. International Conference in Machine Learning, 38115.
LEAP MOTION. Leap Motion. https://www.leapmotion.com/.
LECUN, Y. 2012. Learning invariant feature hierarchies. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7583 LNCS, PART 1, 496505.
LECUN, Y. 2014. The Unreasonable Effectiveness of Deep Learning. Facebook AI Research & Center for Data Science, NYU.
LECUN, Y. AND BENGIO, Y. 1995. Convolutional networks for images, speech, and time-series. The handbook of brain theory and neural networks 3361, 255258.
LECUN, Y., BOSER, B., DENKER, J.S., ET AL. 1989. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation 1, 541551.
LECUN, Y., BOTTOU, L., BENGIO, Y., AND HAFFNER, P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11, 2278-2323.
LECUN, Y., CORTES, C., AND BURGES, C.J.C. The MNIST Database. http://yann.lecun.com/exdb/mnist/index.html.
LECUN, Y., HUANG, F.J., AND BOTTOU, L. 2004. Learning methods for generic object recognition with invariance to pose and lighting. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004. 2.
LECUN, Y., JACKEL, L., BOTTOU, L., ET AL. 1995. Comparison of learning algorithms for handwritten digit recognition. International Conference on artificial neural networks, 5360.
LEE, M., FREED, A., AND WESSEL, D. 1992. Neural networks for simultaneous classification and parameter estimation in musical instrument control. Proceedings of SPIE 1706, 244255.
LEMAN, M. 2007. Embodied Music Cognition and Mediation Technology.
LEVIN, G. 2000. Painterly Interfaces for Audiovisual Performance. Media, 1151.
MAHENDRAN, A. AND VEDALDI, A. 2014. Understanding Deep Image Representations by Inverting Them.
MCCORMACK, J. AND D'INVERNO, M. 2012. Computers and Creativity: The Road Ahead. Computers and Creativity, 421424.
MCCORMACK, J. AND D'INVERNO, M. 2014. On the Future of Computers and Creativity.
MCNEILL, D. AND LEVY, E. 1980. Conceptual representations in language activity and gesture.
METZINGER, T. AND GALLESE, V. 2003. The emergence of a shared action ontology: Building blocks for a theory. Consciousness and Cognition, 549571.
MISTRY, P. AND MAES, P. 2009. SixthSense: a wearable gestural interface. ACM SIGGRAPH ASIA 2009 Sketches, ACM.
MITAL, P.K., GRIERSON, M., AND SMITH, T.J. 2013. Corpus-based visual synthesis. Proceedings of the ACM Symposium on Applied Perception SAP '13 July, 5158.
MITCHELL, T.M. 1997. Machine Learning. McGraw-Hill.
MITRA, S. AND ACHARYA, T. 2007. Gesture Recognition: A Survey. IEEE Transactions On Systems, Man, And Cybernetics Part C: Applications And Reviews 37, 3, 311324.
MNIH, V., HEESS, N., GRAVES, A., AND KAVUKCUOGLU, K. 2014. Recurrent Models of Visual Attention. NIPS, 112.
MNIH, V., KAVUKCUOGLU, K., SILVER, D., ET AL. 2013. Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:, 19.
MORDVINTSEV, A., OLAH, C., AND TYKA, M. 2015. Deepdream inceptionism. http://googleresearch.blogspot.ch/2015/06/inceptionismgoingdeeperintoneural.html.
NG, A. 2013. Machine Learning and AI via Brain Simulations. Stanford University.
NGUYEN, A., YOSINSKI, J., AND CLUNE, J. 2015. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. CVPR 2015.
NISSEN, S. 2003. Fast Artificial Neural Network Library. http://leenissen.dk/fann/wp/.
NVIDIA. 2015. Deep Learning GPU Training System (DIGITS). https://developer.nvidia.com/digits/.
RAINA, R., MADHAVAN, A., AND NG, A.Y. 2009. Large-scale deep unsupervised learning using graphics processors. ICML 9, 873880.
ROKEBY, D. 1986. Very Nervous System.
SCHEDEL, M., FIEBRINK, R., AND PERRY, P. 2011. Wekinating 000000Swan: Using Machine Learning to Create and Control Complex Artistic Systems. Proceedings of the International Conference on New Interfaces for Musical Expression June, 453-456.
SCHMIDHUBER, J. 2014. Deep Learning in Neural Networks: An Overview. arXiv preprint arXiv:1404.7828, 166.
SEARLE, J.R. 1980. Minds, Brains, and Programs. Behavioral and Brain Sciences 3, 1-19.
SIMONYAN, K., VEDALDI, A., AND ZISSERMAN, A. 2013. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. arXiv preprint arXiv:1312.6034, 18.
SIMS, K. 1994. Evolving virtual creatures. Siggraph '94 SIGGRAPH, July, 1522.
STURM, B. 2015. Recurrent Neural Networks for Folk Music Generation. https://highnoongmt.wordpress.com/2015/05/22/lislsstisrecurrentneuralnetworksforfolkmusicgeneration.
SZEGEDY, C., LIU, W., JIA, Y., ET AL. 2014. Going Deeper with Convolutions. arXiv preprint arXiv:1409.4842, 112.
THRUN, S., MONTEMERLO, M., DAHLKAMP, H., ET AL. 2006. Stanley: The Robot That Won the DARPA Grand Challenge. Journal of Field Robotics 23, 9, 661692.
TODD, S. AND LATHAM, W. 1992. Evolutionary art and computers. Academic Press, Inc.
TURING, A. 1948. Intelligent Machinery.
TURING, A. 1950. Computing Machinery and Intelligence. Mind 59, 433460.
WRIGHT, M. AND FREED, A. 1997. Open Sound Control: A new protocol for communicating with sound synthesizers. Proceedings of the International Computer Music Conference (ICMC).
ZENG, Z., PANTIC, M., ROISMAN, G.I., AND HUANG, T.S. 2009. A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 1, 3958.