Framework Of Page Segmentation For Mushaf Al-Quran Based


International Journal of Computer Information Systems and Industrial Management Applications.
ISSN 2150-7988 Volume 10 (2018) pp. 028-037
MIR Labs, www.mirlabs.net/ijcisim/index.html
Dynamic Publishers, Inc., USA
Received: 21 Oct, 2017; Accepted: 8 Feb, 2018; Published: 21 Feb, 2018

Framework of Page Segmentation for Mushaf Al-Quran Based on Multiphase Level Segmentation

Amirul Ramzani Radzid1*, Mohd Sanusi Azmi2, Intan Ermahani A. Jalil3, Azah Kamilah Muda4, Laith Bany Melhem5 and Nur Atikah Arbain6

1*, 2, 3, 4, 5, 6 Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Malaysia.
1*amirulramzani@gmail.com, 2sanusi@utem.edu.my, 3ermahani@utem.edu.my, 4azah@utem.edu.my

Abstract: This paper presents a framework of page segmentation for Mushaf Al-Quran based on Multiphase Level Segmentation (MLS). The study focuses on (a) extracting multiform frame shapes using a novel technique, Neighbouring Pixel Behaviors (NPB), and (b) segmenting text lines using a novel technique, Hybrid Projection Based Neighbouring Properties (HPBNP). Mushaf Al-Quran pages are decorated with frames of differing patterns and designs, so the decorative frame must first be extracted from the page in order to obtain only the text of the Mushaf Al-Quran, regardless of decoration heterogeneity. The NPB technique is therefore proposed to remove multiform frame shapes from the page. The text of the Mushaf Al-Quran also carries several diacritical marks, which obstruct text line segmentation. The HPBNP technique is therefore proposed to segment overlapping text lines that are interfered with by diacritical marks or strokes of Arabic words. Experimental results of the proposed techniques are shown in this paper.

Keywords: Page Segmentation, Frame Extraction, Extraction Mushaf Al-Quran Decoration, Mushaf Al-Quran Text Segmentation, Line segmentation.

I. Introduction

Mushaf Al-Quran is the most preserved book in the history of mankind [1]. It is decorated with various ornaments meant to embellish the presentation of the Holy Quran. However, this decoration degrades the authentication process. Page segmentation for Mushaf Al-Quran is therefore an important task: it extracts only the text of Al-Quran from the pages without making any changes to the content of the Mushaf Al-Quran. Page segmentation is a preprocessing stage of document analysis and is considered an important initial step for document image analysis and understanding [2]. A document page contains several kinds of content, such as halftones, decoration, graphics and text, which can be divided into columns or blocks [3][4]. Columns or blocks can be classified into document components such as texts, frames, lines and ornaments, which can then be segmented. Page segmentation is thus a crucial step towards understanding the layout or content of a document. Page segmentation of Mushaf Al-Quran is challenging because of many variations, such as layout structure and decorations. This paper proposes a generic, flexible and multiform segmentation method based on MLS that is not restricted by the decoration frame or by overlapping components of the text line.

Some pages of Mushaf Al-Quran contain decoration frames of various forms and shapes. The frame surrounds the text and serves only to embellish the page. It is crucial to extract the decoration frame from the page in order to analyse the text. In future work, analysing the illumination of a manuscript's decoration frame may reveal information about the specific manuscript [5].

Page layouts can be divided into two classes, overlapping and nonoverlapping [6]. Overlapping can be found in text lines or in other layout components. This paper is concerned with overlapping text lines caused by interfering diacritical marks or strokes of Arabic words. Punctuation and diacritic symbols located between text lines make it more complicated to decipher the physical structure of text lines [7]. Nonoverlapping text line components, by contrast, are clearly separated by white space.

II. Related work

Document page analysis deals with two structures: physical layout and logical structure [8]. The logical structure can be described as logical labels of the document's physical components, where the labels are derived from a set of rules. The physical layout can be described in various forms, independently of or jointly with the logical structure. Such document structure analysis can be seen in the study by Tsujimoto and Asada [9], who represented both the physical layout and the logical structure of a document as trees. Using a set of generic transformation rules and a virtual field separator technique, they modeled document understanding as the transformation of a physical tree into a logical one.

Physical layout analysis algorithms for document page images can be categorized into three classes: top-down approaches, bottom-up approaches and hybrid approaches [8][10]. The top-down approach to page segmentation splits large regions into smaller sub-regions. Deng Cai [11], in his study of

a vision-based page segmentation algorithm, used an automatic top-down, tag-tree independent approach to detect web content structure. Sukhvir Kaur [12] noted that the XY cut segmentation algorithm, also known as the recursive XY cuts (RXYC) algorithm, is a tree-based top-down algorithm [13]. The bottom-up approach, on the other hand, starts by grouping pixels of interest and then merges them into larger blocks or connected components. Akiyama and Hagita [14] performed bottom-up layout analysis that works with both global and local text features along with generic properties of documents. Similarly, Fisher et al. [15] performed bottom-up segmentation [16]. Hybrid approaches combine top-down and bottom-up approaches. For example, Seyyed Yasser Hashemi [17] indicated that a hybrid method for segmenting Persian/Arabic document images can be used to cope with complex document layouts. This paper adopts a top-down approach to segment a page of Mushaf Al-Quran, from page into paragraphs and then from paragraph into text lines.

In 2016, Ha Dai-Ton et al. [18] proposed an adaptive over-split and merge algorithm for page segmentation that simultaneously reduces over-segmentation and under-segmentation errors. In 2015, Kai Chen et al. [19] studied page segmentation of historical document images with convolutional autoencoders. They proposed an unsupervised feature learning method for page segmentation of documents available as colour images, applying convolutional autoencoders to learn features directly from pixel intensity values. In 2014, Kai Chen et al. [20] proposed page segmentation for historical handwritten document images using colour and texture features, a physical structure detection method for historical handwritten documents. In 2016, Kai Chen et al. [2] proposed page segmentation for historical document images based on superpixel classification with unsupervised feature learning. In 2017, Kai Chen et al. [21] presented a CNN-based page segmentation method for handwritten historical document images. These techniques are unsuitable for Mushaf Al-Quran pages, because Mushaf Al-Quran text contains overlaps caused by diacritics or strokes of Arabic words, and the pages have multiform frame shapes.

In 2013, T. Abu-Ain et al. [22] proposed text normalization for selecting the correct baseline region. Their method consists of seven main stages for baseline straightening and slant correction. The present research continues earlier paleography research and relates to the field of digital Jawi paleography. Mohd Sanusi Azmi introduced features from triangle geometry for digit recognition in Jawi paleography [23] and applied the technique to Arabic and Jawi. That work is related to this research topic because Mushaf Al-Quran is written in Arabic.

This research also continues the pre-processing stage from a study on removing Al-Quran illumination [24], which focused on removing illumination from the text. Past studies have also addressed frame illumination removal and text line segmentation on Mushaf Al-Quran [25][26], but the proposed techniques were ineffective in solving the problem.

Arabic is the sacred language of Mushaf Al-Quran [27]. Arabic calligraphy classification has also been studied [28]. Classification of Arabic calligraphy in ancient manuscripts can give useful information to paleographers, so such work can be applied to Mushaf Al-Quran for authentication purposes in future work.

III. Dataset

We experimented with six different types of Mushaf Al-Quran for multiform frame shape extraction. For text line segmentation, we used four different types of text line in Mushaf Al-Quran that contain overlapping. Table 1 lists the Mushaf Al-Quran pages used for the multiform frame shape extraction experiments, and Table 2 lists the Mushaf Al-Quran text lines, containing overlaps, used for the text line segmentation experiments.

Number   Source                                                            Page
1        Image of Al-Quran Al-Karim from Mawarsoft Digital Furqan 1.0      2
2        Image of Al-Quran Al-Karim from Mushaf Al-Madinah Quran Majeed    1
3        Image of Al-Quran Al-Karim from KSU-Electronic Mosshaf            1
4        Image of Al-Quran Al-Karim from Mushaf Al-Madinah Quran Majeed    3
5        Image of Al-Quran Al-Karim from Mawarsoft Digital Furqan 1.0      4
6        Image of Al-Quran Al-Karim from Uthmani Script Mushaf             2

Table 1. Dataset of Mushaf Al-Quran pages.

Number   Source                                                                Page   Row
1        Mushaf Al-Quran Rasm Uthmani published by company S Abdul Majeed     6      11-13
2        Mushaf Al-Madinah Quran Majeed                                        3      3-5
3        Mushaf Al-Madinah Quran Majeed                                        3      6-8
4        Mushaf Al-Madinah Quran Majeed                                        3      8-10

Table 2. Dataset of Mushaf Al-Quran text lines.

IV. Proposed Method

A. Pre-processing

Before processing, the dataset must be prepared. The pages of Mushaf Al-Quran used in this experiment are a collection of text images from digitized Mushaf Al-Quran. Each text image must contain decoration, illumination or illustration so that the multiform frame can be segmented. Conventional steps, for instance noise removal and

filtering, together with text normalization such as baseline correction, slant normalization and skew correction, must be applied. These steps make image processing more reliable and effective [29].

At this phase, a preprocessing algorithm is applied to the page image as a data provision stage. Its purpose is to enhance the input image and bring it into a uniform binary format. The coloured input image is converted to grey-scale and then to binary. The conversion from grey-scale to binary is called binarization; it follows the study by N. B. Venkateswarlu and R. D. Boyle [30] on new segmentation techniques for document image analysis. In the binary image, the foreground is labeled “0” and the background is labeled “1”.

Thresholding is one of the important image preprocessing techniques; it converts a grey-scale image into a binary image. This experiment uses Otsu's method, proposed by Otsu in 1979 [31]. The idea of thresholding is to select an optimal grey-level threshold that separates the objects of interest in an image from the background based on their grey-level distribution [31]. If g(x, y) is a thresholded version of f(x, y) at some global threshold T, it can be defined as [32]

\[ g(x, y) = \begin{cases} 1 & \text{if } f(x, y) \ge T \\ 0 & \text{otherwise} \end{cases} \qquad (1) \]

The thresholding operation is defined as:

\[ T = M[x, y, p(x, y), f(x, y)] \qquad (2) \]

In equations (1) and (2), T is the threshold, f(x, y) is the grey value of point (x, y), and p(x, y) is some local property of the point, such as the average grey value of the neighbourhood centred on (x, y).

B. Operational framework page segmentation method

In this paper the segmentation method has four phases: a) input image and image pre-processing, b) frame extraction and text line segmentation, c) result output and d) feature extraction and result validation. Figure 1 shows the operational framework of the page segmentation method. The Input Image and Pre-Processing Image phase is explained in Section A, and the Frame Extraction and Text Line Segmentation phase is explained in Section C. The Result Output phase is described in Section V, Experiment Result. The result of this experiment is in image form. Features can be extracted from the output image to validate and classify the proposed techniques. Classification in this experiment was conducted using Unsupervised Machine Learning (UML) with minimum Euclidean distance and average accuracy mean [33].

Figure 1. Operational framework page segmentation method

C. Page Segmentation Method

This paper presents page segmentation for Mushaf Al-Quran based on Multiphase Level Segmentation (MLS). MLS comprises two proposed techniques, which act as different levels of the segmentation method: 1) Neighbouring Pixel Behaviors (NPB) and 2) Hybrid Projection Based Neighbouring Properties (HPBNP). NPB addresses multiform frame shape extraction, while HPBNP addresses text line segmentation. Figure 2 shows the proposed page segmentation method; its top-level box reads “Proposed Method: Page Segmentation for Mushaf Al-Quran Based on Multiphase Level Segmentation (MLS)”.
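As an illustration of the pre-processing phase in Section A (colour to grey-scale to binary), the following is a minimal Python sketch, not the authors' Java implementation. It assumes images are plain nested lists, that pixels at or above the threshold become background "1" and darker pixels become foreground "0" (the labelling stated in Section A), and that grey levels are computed with BT.601 luma weights. The function names and the weights are hypothetical choices made for this example.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to 0-255 grey levels.

    The BT.601 luma weights are an assumption; the paper does not say
    which colour-to-grey conversion was used.
    """
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_rows]


def otsu_threshold(gray_rows, levels=256):
    """Return the threshold T that maximizes the between-class variance.

    Class 0 holds grey levels below T, class 1 holds levels at or above T.
    Returns 0 when the image has a single grey level.
    """
    hist = [0] * levels
    for row in gray_rows:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with level < t
    sum0 = 0  # sum of grey levels of those pixels
    for t in range(1, levels):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(gray_rows, t):
    """Global thresholding: 1 (background) where f(x, y) >= T, else 0 (ink)."""
    return [[1 if v >= t else 0 for v in row] for row in gray_rows]
```

A typical call chain would be `binarize(gray, otsu_threshold(gray))` with `gray = to_grayscale(rgb)`. A global threshold is the simplest choice; unevenly lit scans may need a local method, which is outside what the paper describes.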

[Figure 2 boxes: Proposed Technique 1: Multiform Frame Shape Extraction Using Neighbouring Pixel Behaviors (NPB). Proposed Technique 2: Text Line Segmentation Using Hybrid Projection Based Neighbouring Properties (HPBNP).]

Figure 2. Proposed page segmentation method

1) Multiform Frame Shape Extraction (MFSE)

There are several challenges in extracting the significant information from existing Mushaf Al-Quran pages. One of the main challenges is to extract text that is surrounded by decorations of different patterns and textures. To extract the text, the decoration frame must be properly identified on the page. Multiform frame shape extraction is therefore proposed to extract the frame from a page of Mushaf Al-Quran.

The proposed method, multiform frame shape extraction using Neighbouring Pixel Behaviors (NPB), addresses the difficulty of extracting the frame decoration from a page. If the decorations are not removed, they can be mistaken for part of the Mushaf Al-Quran text. This study thus aims to extract the text of Al-Quran from the images automatically, without making any changes to the content of the Mushaf Al-Quran, so that the extracted images contain only Mushaf Al-Quran text regardless of the heterogeneity of the frame decoration. The novel NPB technique is proposed to address this problem.

The technique identifies boundary regions, which are the gaps or blank-space regions between the Arabic text (middle) and the decoration (side). The algorithm analyses a range of pixels, 4% of the page length for both vertical and horizontal points, over which the pixels continuously have the same properties. Figure 3 shows an example of boundary regions on a Mushaf Al-Quran page.

Figure 3. Example of boundary regions on the Mushaf Al-Quran page

The recognized boundary regions located outside the text area (middle region) are passed to the next process, point-of-region detection. Four regions of interest are considered in this study: page region, decoration region, boundary region and text region. With this, the points of intersection between the borders of every region are identified. The process can be applied to Mushaf Al-Quran decoration frames of different shapes and patterns. For example, Figure 4 illustrates the points of detection on a rectangular decoration frame, while Figure 5 illustrates the points of detection on an oval decoration frame.

Figure 4. Example point of detection on rectangle decoration frame

Figure 5. Example point of detection on oval decoration frame

Figure 4 and Figure 5 present the following information:
(a) Point of detection on document page region.
(b) Point of detection on decoration frame region.
(c) Point of detection on boundary region.
(d) Point of detection on text region.

After the point of detection on regions has been applied to recognize the decoration frame, the next step is the point of recognition of pixels based on neighbouring pixel properties, which clusters pixels that have the same properties. Figure 6 shows an example of a cluster used to identify pixel points.

Figure 6. Cluster of pixels point identified by using neighbouring pixels properties

The point-of-recognition process is used to identify the balance regions of the frame decoration, which are depicted in Figure 7. This process can extract multiform decoration frames from a page of Mushaf Al-Quran. The results are shown in Section V, Experiment Result.

Figure 7. Balance regions of frame decoration

2) Text Segmentation

Text line segmentation is an important step in document image processing. It is part of the pre-processing stage that prepares the images for feature extraction or classification. In this paper, we present a novel technique of text line segmentation for Mushaf Al-Quran text using Hybrid Projection Based Neighbouring Properties (HPBNP), which is based on pixel, object and histogram properties. The algorithm identifies overlaps between neighbouring text lines and segments each line with precision. Overlaps caused by interfering diacritical marks or strokes of Arabic words must be segmented properly, without changing the original meaning of the text. These diacritical marks are an obstacle during text line segmentation because they cause overlapping, as illustrated in Figure 8.

Figure 8. Example of diacritical marks that cause overlapping

In the first step, the algorithm computes the horizontal projection profile, counting the pixels in each row and plotting the result as a graph, as shown in Figure 9.

Figure 9. Result of horizontal projection histogram

In the second step, the algorithm computes object ownership from either the distance to the baseline or the distance to a determined object.

To determine the ownership of a diacritical mark from the baseline distance, the algorithm calculates the gap between the diacritical mark (or stroke of the Arabic word) and both the upper and the lower text baseline. The object is assigned to the nearest text baseline. The distances between the baseline and the object are depicted in Figure 10.

Figure 10. Illustration of the distance between the object (diacritical marks or stroke of the Arabic word) with baseline

The other process uses the distance to a determined object. To determine the ownership of a diacritical mark in this way, the algorithm calculates the gap between the diacritical mark (or stroke of the Arabic word) and the determined objects above and below it. The diacritical mark is then assigned to the nearest text line. The distances between the diacritical mark and the determined objects are depicted in Figure 11.
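The first two HPBNP steps described above (horizontal projection profile, then ownership by baseline distance) can be sketched in Python as follows. This is an illustrative reading of the prose, not the authors' Java code. The peak-picking rule and its `min_ratio` parameter, the use of an object's vertical centre to measure distance, and the rule that ties go to the upper line are all assumptions made for this example; the function names are hypothetical.

```python
def horizontal_projection(binary_rows, foreground=0):
    """Count foreground pixels in every row (foreground is labeled 0)."""
    return [sum(1 for v in row if v == foreground) for row in binary_rows]


def find_baselines(profile, min_ratio=0.5):
    """Pick candidate baseline rows from the projection profile.

    Assumed rule: a row is a baseline if its count is at least
    min_ratio * max(profile), is greater than the row above and is not
    smaller than the row below. Low peaks (e.g. rows holding only
    diacritics) fall under the cutoff and are left for ownership.
    """
    peak = max(profile) if profile else 0
    if peak == 0:
        return []
    cutoff = min_ratio * peak
    baselines = []
    for y, count in enumerate(profile):
        above = profile[y - 1] if y > 0 else -1
        below = profile[y + 1] if y + 1 < len(profile) else -1
        if count >= cutoff and count > above and count >= below:
            baselines.append(y)
    return baselines


def assign_owner(object_rows, baselines):
    """Return the index of the baseline nearest to an object.

    object_rows is (top_row, bottom_row) of a diacritic or stroke.
    Distance is measured from the object's vertical centre (an
    assumption); on a tie the earlier, upper baseline wins.
    """
    top, bottom = object_rows
    centre = (top + bottom) / 2
    return min(range(len(baselines)), key=lambda i: abs(baselines[i] - centre))
```

For example, with baselines detected at rows 2 and 7, an object occupying row 4 is 2 rows from the first baseline and 3 from the second, so `assign_owner((4, 4), [2, 7])` selects the first line. The second ownership rule, based on the nearest determined object, is not covered by this sketch.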

Figure 11. Illustration of the distance between the object (diacritical marks or stroke of the Arabic word) with the nearest object

Lastly, the algorithm segments the text lines. It uses the horizontal projection profile to detect the number of baselines and treats the lower peaks of the contour as overlaps. For each overlap, it determines object possession to decide the row number of the text line to which the object belongs. The pseudocode is shown in Figure 12.

1.0 Start
2.0 Read input image
3.0 Pre-process input image
4.0 Detect baseline using horizontal projection profile
5.0 Fabricate object using neighbouring pixel properties
6.0 Determine object possession
    6.1 Define object possession using distance of baseline
    6.2 Define object possession using determined object
7.0 Output result image
8.0 End

Figure 12. Pseudocode Hybrid Projection Based Neighbouring Properties (HPBNP)

D. Feature extraction and result validation.

The result of this processing takes the form of binary images, which facilitates the subsequent feature extraction. Features will be extracted from the resulting images using the Geometric Triangle Using Background Foreground Image (STDIL) proposed by N. A. Arbain et al. [34]. Experimental results are produced by comparing the results of the present techniques with those of previously proposed techniques using unsupervised machine learning (UML). The UML measures used are minimum Euclidean distance and average accuracy mean (AAM). The results of this phase are not reported in this paper and will be reported in further research.

V. EXPERIMENT RESULT AND DISCUSSION

The experiment was implemented in Java and tested on the selected dataset of Mushaf Al-Quran described in Section III, Dataset. The results of the proposed method were compared with the Binary Representation (BR) techniques proposed by L. B. Melhem in 2015 [25] and 2017 [26].

A. Comparison Frame Extraction and Removal

To remove the multiform frame shape from a Mushaf Al-Quran page, the frame must first be identified. Most research focuses on removing illumination or ornaments [2][20][21][35] from the page. In past research the end-of-verse object (Taskil) was misinterpreted as part of the illumination or ornament, whereas in this study it is treated as part of the text, since it marks the end of the verse and gives the verse number in Al-Quran. It can also be used in further study to segment the verses of Mushaf Al-Quran text. Moreover, this study removes all text outside the decoration frame, including the name of the surah at the top of the page, which does not affect the ayah of Mushaf Al-Quran. This study also differs from past research that focuses on different domains. Table 3 shows the results of multiform frame extraction using the proposed Neighbouring Pixel Behaviors (NPB) method. The proposed method can identify the different shapes of decoration on Mushaf Al-Quran pages.

[Table 3 columns: Source (refer Table 1); Input Image; Multiform Frame extraction (Binary Format). Rows 1 to 6 contain images that are not reproduced in this transcription.]

Table 3. Result of Multiform Frame Identification.

The most closely related study on frame removal in the Mushaf Al-Quran domain was done by L. B. Melhem in 2015 [25] using Binary Representation (BR). Unfortunately, BR was unable to remove the decoration frame from image source 1 (Image of Al-Quran Al-Karim from Mawarsoft

Digital Furqan 1.0 page 2), source 2 (Image of Al-Quran Al-Karim from Mushaf Al-Madinah Quran Majeed page 1) and source 3 (Image of Al-Quran Al-Karim from KSU Electronic Mosshaf). This is because the method proposed by that researcher works only on rectangular shapes, whereas this research handles multiform frames on Mushaf Al-Quran pages, as shown in Table 4. Table 4 compares the results for multiform frame shape extraction.

[Table 4 columns: Source (refer Table 1); Input Image; Result of Binary Representation [25]; Result of Proposed Method (NPB). Rows 1 to 6 contain images that are not reproduced in this transcription. The Binary Representation column reads "cannot be processed" for sources 1, 2 and 3.]

Table 4. Result comparison of multiphase for page segmentation.

B. Comparison Text Segmentation

Comparison with the BR techniques [26] is made because that research on text segmentation is in the same domain, Mushaf Al-Quran. However, those techniques were ineffective on Mushaf Al-Quran text, because the text contains diacritical marks and strokes of Arabic words that cause overlapping. This research proposes a text segmentation technique to resolve overlapping text on Mushaf Al-Quran pages, as shown in Table 5 to Table 8. The results show that the proposed HPBNP method can solve the overlapping problem. Table 5 to Table 8 show the results for the Mushaf Al-Quran text lines that contain overlapping. The segmented line images are not reproduced in this transcription; only the row labels are listed.

Input: Row 11-13
Result of Binary Representation [26] [25]: Row 11; Row 12-13
Result of Proposed Method (HPBNP): Row 11; Row 12; Row 13

Table 5. Result of text image of Mushaf Al-Quran Rasm Uthmani publish by company S Abdul Majeed page 6.

Input: Row 3-5
Result of Binary Representation [26] [25]: Row 3; Row 3-5
Result of Proposed Method (HPBNP): Row 3; Row 4; Row 5

Table 6. Result of text image of Mushaf Al-Quran from Mushaf Al-Madinah Quran Majeed page 3.

Input: Row 6-8
Result of Binary Representation [26] [25]: Row 6; Row 7-8
Result of Proposed Method (HPBNP): Row 6; Row 7; Row 8

Table 7. Result of text image of Mushaf Al-Quran from Mushaf Al-Madinah Quran Majeed page 3.

Input: Row 8-10
Result of Binary Representation [26] [25]: Row 8; Row 9-10
Result of Proposed Method (HPBNP): Row 8; Row 9; Row 10

Table 8. Result of text image of Mushaf Al-Quran from Mushaf Al-Madinah Quran Majeed page 3.

VI. CONCLUSION

The results for multiform frame shape extraction are compared with the Binary Representation technique proposed by L. B. Melhem [25] on the same dataset, as shown in Table 4. The dataset used for this experiment is shown in Table 1. The results show that the proposed Neighbouring Pixel Behaviors (NPB) method for multiform frame shape extraction solves the problem more effectively than prior research.

The results for text line segmentation are compared with L. B. Melhem [26] on the same dataset, as shown in Table 5, Table 6, Table 7 and Table 8. The dataset used for this experiment is shown in Table 2. The results show that the proposed Hybrid Projection Based Neighbouring Properties (HPBNP) method for text line segmentation solves the problem more effectively than prior research.

Future work for this study will be verse segmentation. The end-of-verse object (Taskil) will be used as a guide to segment the full sentence of each verse. The proposed method will be applied to conduct verse segmentation.

In this paper, we presented a framework of page segmentation for Mushaf Al-Quran based on Multiphase Level Segmentation (MLS). The study focuses on extracting multiform frame shapes using the novel Neighbouring Pixel Behaviors (NPB) technique and on segmenting text lines using the novel Hybrid Projection Based Neighbouring Properties (HPBNP) technique. The NPB technique removes multiform frame shapes from the page of Mushaf Al-Quran, while the HPBNP technique segments overlapping text lines caused by interfering diacritical marks or strokes of Arabic words.

Acknowledgment

The authors would like to express their appreciation to Universiti Teknikal Malaysia Melaka for the scholarship under the Zamalah UTeM Scheme. Thanks also go to the Faculty of Information Technology and Communication for providing excellent research facilities.

References

[1] C. Paper, C. A. Language, and P. D. View, “Data Preparation and Handling for Written Quran Script Verification,” no. October, 2016.
[2] K. Chen, C.-L. Liu, M. Seuret, M. Liwicki, J. Hennebert, and R. Ingold, “Page Segmentation for Historical Document Images Based on Superpixel Classification with Unsupervised Feature Learning,” in 2016 12th IAPR Workshop on Document Analysis Systems (DAS), 2016, pp. 299–304.
[3] T. Pavlidis and J. Zhou, “Page segmentation and classification,” CVGIP Graph. Model. Image Process., vol. 54, no. 6, pp. 484–496, Nov. 1992.
[4] A. K. Jain and Y. Zhong, “Page segmentation using texture analysis,” Pattern Recognit., vol. 29, no. 5, pp. 743–770, 1996.
[5] M. S. Azmi and K. Omar, “Features Extraction of Arabic Calligraphy using extended Triangle Model for Digital Jawi Paleography Analysis,” Int. J. Comput. Inf. Syst. Ind. Manag. Appl., vol. 5, pp. 696–703, 2013.
[6] K. Kise and A. Sato, “Page Segmentation Using the Area Voronoi Diagram,” Tech. Rep. IEICE. PRMU, vol. 96, no. 598, pp. 9–16, 1997.
[7] R. Saabni, A. Asi, and J. El-Sana, “Text line extraction for historical document images,” Pattern Recognit. Lett., vol. 35, no. 1, pp. 23–33, 2014.
[8] S. Mao, A. Rosenfeld, and T. Kanungo, “Document Structure Analysis Algorithms: a Literature Survey,” SPIE 5010, Doc. Recognit. Retr. X, vol. 5010, no. 1, p. 197, 2003.
[9] S. Tsujimoto and H. Asada, “Understanding multi-articled documents,” in [1990] Proceedings. 10th International Conference on Pattern Recognition, 1990, vol. i, no. 4, pp. 551–556.
[10] Song Mao and T. Kanungo, “Empirical performance evaluation methodology and its application to page segmentation algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 3, pp. 242–256, 2001.
[11] D. Cai, S. Yu, J. R. Wen, and W. Y. Ma, “VIPS: a vision-based page segmentation algorithm,” Beijing Microsoft Res. Asia, pp. 1–29, 2003.
[12] S. Kaur, P. Mann, and S. Khurana, “Page Segmentation in OCR System-A Review,” Ijcsit.Com, vol. 4, no. 3, pp. 420–422, 2013.
[13] G. Nagy, S. Seth, and M. Viswanathan, “A Prototype Document Image Analysis System for Technical Journals,” Computer (Long. Beach. Calif)., vol. 25, no. 7, pp. 10–22, 1992.
[14] T. Akiyama and N. Hagita, “Automated entry system for printed documents,” Pattern Recognit., vol. 23, no. 11, pp. 1141–1154, Jan. 1990.
[15] G. Nagy, S. Set
