A blog for keeping track of the Data Insight Study's material.

Data Insight Study Blog


[Scrap] Process Mining

While sorting through the emails that arrived overnight, I came across something interesting, so I am scrapping it here.
 
It is about "process mining". One sentence from the book's preface really resonates with me: "A data scientist also needs to relate data to operational processes and be able to ask the right questions." The book is fairly long, but I hope to get through it this year. ^^;
 

Concept:  

Process mining 

Main community page http://www.processmining.org/ 

Main/recent reference literature http://www.springer.com/gp/book/9783662498507 

Process Mining

Data Science in Action

Authors: van der Aalst, Wil M.P.  


Potential Tools: 

Available in Microsoft/Azure:

Process Discovery 

·        Sequence Mining / Markov Chains 

§  Closest approximate match = Sequence Cluster Model (Microsoft Analysis Services, SQL Server 2016) 

·        Sequence clustering based on n-order Markov Chains

·        Markov chains cannot represent concurrency, so additional post-processing is needed

·        Sequence mining requires the maximum sequence length to be fixed in advance

·        The algorithm may be very compute-intensive

·        Designed for clickstream analysis
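To make the Markov-chain remarks above concrete, here is a minimal Python sketch that estimates n-order transition probabilities from a small event log. It is not the Microsoft Sequence Clustering algorithm; the function name, the boundary markers, and the sample traces are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical boundary markers


def transition_probabilities(traces, order=1):
    """Estimate P(next activity | previous `order` activities) from traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Pad so that the first real activity also has a full-length history.
        padded = [START] * order + list(trace) + [END]
        for i in range(order, len(padded)):
            state = tuple(padded[i - order:i])
            counts[state][padded[i]] += 1
    # Normalise the counts of each state into probabilities.
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


# Hypothetical log: after "register", "check" and "pay" happen in either order.
traces = [
    ["register", "check", "pay", "ship"],
    ["register", "pay", "check", "ship"],
]
probs = transition_probabilities(traces, order=1)
print(probs[("register",)])
```

In this toy log the two interleavings of "check" and "pay" simply appear as two sequential branches out of "register". The chain has no notion that the two activities run in parallel, which is why a post-processing step is needed to recover concurrency.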

 

·        Neural Networks 

§  Azure Machine Learning:

·        Two-Class Neural Networks: only useful for two-class classification

 

·        Visualisation: PowerBI (see KPMG)

§  https://powerbi.microsoft.com/en-us/partner-showcase/kpmg-processmining/ 

 

ProM

Open source reference implementation (interactive) 

http://www.promtools.org/doku.php 

Support for:

§  Heuristic Mining

§  Genetic Mining

§  Fuzzy Mining

§  Region Mining

§  Inductive Mining

 

IMPORTANT REMARK

§  Some logic seems to be executable from the CLI

·        https://dirksmetric.wordpress.com/2015/03/11/tutorial-automating-process-mining-with-proms-command-line-interface/

·        CLI facility used for unit testing

·        Should be the same subset as is exposed via RapidMiner

 

RapidProM

Plugin for RapidMiner (visual; lets you design process mining activity flows) that gives access to ProM functionality from within RapidMiner

Installation: http://www.promtools.org/doku.php?id=rapidprom:installation

Userguide: http://www.win.tue.nl/~abolt/userfiles/downloads/RapidProM/RapidProM_user_guide.pdf

 

IMPORTANT REMARK:

·        RapidMiner is expensive and seems to be one of the best tools around

·        https://rapidminer.com/pricing/

·        https://1xltkxylmzx3z8gd647akcdvov-wpengine.netdna-ssl.com/wp-content/uploads/2016/12/rm-platform-fact-sheet-12-2.pdf

·        RapidMiner with RapidProM seems to be heavily promoted by Technische Universiteit Eindhoven (TU/e) for chaining activities

§  Availability

·        Commercial free download limited to 10,000 lines

·        Via https://my.rapidminer.com/nexus/account/index.html#downloads at http://go.rapidminer.com/rm-studio-download-windows-64bit

·        Unlimited older open source versions available

·        via http://www.rapidprom.org/ at http://www.promtools.org/rapidprom/downloads/rapidminer-5.3.015x64-install.exe 

 

PMLAB 

Script-driven process mining; not on par with ProM in terms of algorithms

https://github.com/josepcarmona/PMLAB 

Paper: http://ceur-ws.org/Vol-1295/paper4.pdf 

 

CRAN-R pMineR package 

https://cran.r-project.org/web/packages/pMineR/pMineR.pdf 

Supports (less advanced/evolved)

·        Alpha-Miner Model

·        First Order Markov Model
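As a rough illustration of what an Alpha-Miner-style model starts from, the sketch below computes the "footprint" relations between activities from the directly-follows relation of an event log. This is a Python sketch of that first step only, not the full Alpha algorithm and not pMineR's API (pMineR is an R package); the function name and the sample traces are hypothetical.

```python
from itertools import product


def footprint(traces):
    """Classify every ordered activity pair via the directly-follows relation."""
    activities = sorted({a for t in traces for a in t})
    # (a, b) is in `follows` if b occurs immediately after a in some trace.
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    relation = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and ba:
            relation[a, b] = "||"  # seen in both orders: treated as parallel
        elif ab:
            relation[a, b] = "->"  # a causes b
        elif ba:
            relation[a, b] = "<-"  # b causes a
        else:
            relation[a, b] = "#"   # never directly adjacent
    return relation


# Hypothetical log: "check" and "pay" occur in either order after "register".
traces = [
    ["register", "check", "pay", "ship"],
    ["register", "pay", "check", "ship"],
]
fp = footprint(traces)
print(fp["check", "pay"], fp["register", "check"], fp["register", "ship"])
```

Unlike a first-order Markov model, which would record "check" then "pay" and "pay" then "check" as two separate sequential transitions, the footprint marks the pair as parallel, and the Alpha algorithm uses these relations to place the connecting places of a Petri net.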
