Author Souhail Bakkali; Sanket Biswas; Zuheng Ming; Mickael Coustaty; Marçal Rusiñol; Oriol Ramos Terrades; Josep Llados
  Title TransferDoc: A Self-Supervised Transferable Document Representation Learning Model Unifying Vision and Language Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The field of visual document understanding has witnessed a rapid growth in emerging challenges and powerful multi-modal strategies. However, they rely on an extensive amount of document data to learn their pretext objectives in a "pre-train-then-fine-tune" paradigm and thus, suffer a significant performance drop in real-world online industrial settings. One major reason is the over-reliance on OCR engines to extract local positional information within a document page. Therefore, this hinders the model's generalizability, flexibility and robustness due to the lack of capturing global information within a document image. We introduce TransferDoc, a cross-modal transformer-based architecture pre-trained in a self-supervised fashion using three novel pretext objectives. TransferDoc learns richer semantic concepts by unifying language and visual representations, which enables the production of more transferable models. Besides, two novel downstream tasks have been introduced for a "closer-to-real" industrial evaluation scenario where TransferDoc outperforms other state-of-the-art approaches.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ BBM2023 Serial 3995  
Permanent link to this record
 

 
Author Ruben Tito; Khanh Nguyen; Marlon Tobaben; Raouf Kerkouche; Mohamed Ali Souibgui; Kangsoo Jung; Lei Kang; Ernest Valveny; Antti Honkela; Mario Fritz; Dimosthenis Karatzas
  Title Privacy-Aware Document Visual Question Answering Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Document Visual Question Answering (DocVQA) is a fast growing branch of document understanding. Despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees.
In this work, we explore privacy in the domain of DocVQA for the first time. We highlight privacy issues in state of the art multi-modal LLM models used for DocVQA, and explore possible solutions.
Specifically, we focus on the invoice processing use case as a realistic, widely used scenario for document understanding, and propose a large scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme, that reflects the real-life distribution of documents in different businesses, and we explore the use case where the ID of the invoice issuer is the sensitive information to be protected.
We demonstrate that non-private models tend to memorise, behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through any of the two input modalities: vision (document image) or language (OCR tokens).
Finally, we design an attack exploiting the memorisation effect of the model, and demonstrate its effectiveness in probing different DocVQA models.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ PNT2023 Serial 4012  
Permanent link to this record
 

 
Author Daniel Marczak; Sebastian Cygert; Tomasz Trzcinski; Bartlomiej Twardowski
  Title Revisiting Supervision for Continual Representation Learning Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In the field of continual learning, models are designed to learn tasks one after the other. While most research has centered on supervised continual learning, recent studies have highlighted the strengths of self-supervised continual representation learning. The improved transferability of representations built with self-supervised methods is often associated with the role played by the multi-layer perceptron projector. In this work, we depart from this observation and reexamine the role of supervision in continual representation learning. We reckon that additional information, such as human annotations, should not deteriorate the quality of representations. Our findings show that supervised models when enhanced with a multi-layer perceptron head, can outperform self-supervised models in continual representation learning.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes xxx Approved no  
  Call Number Admin @ si @ MCT2023 Serial 4013  
Permanent link to this record
 

 
Author Jose Luis Gomez; Manuel Silva; Antonio Seoane; Agnes Borras; Mario Noriega; German Ros; Jose Antonio Iglesias; Antonio Lopez
  Title All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes Type Miscellaneous
  Year 2023 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We introduce UrbanSyn, a photorealistic dataset acquired through semi-procedurally generated synthetic urban driving scenarios. Developed using high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements GTAV and Synscapes datasets to form what we coin as the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation. Results on real-world datasets, Cityscapes, Mapillary Vistas, and BDD100K, establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible (this http URL).  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ GSS2023 Serial 4015  
Permanent link to this record
 

 
Author Mustafa Hajij; Mathilde Papillon; Florian Frantzen; Jens Agerberg; Ibrahem AlJabea; Ruben Ballester; Claudio Battiloro; Guillermo Bernardez; Tolga Birdal; Aiden Brent; Peter Chin; Sergio Escalera; Simone Fiorellino; Odin Hoff Gardaa; Gurusankar Gopalakrishnan; Devendra Govil; Josef Hoppe; Maneel Reddy Karri; Jude Khouja; Manuel Lecha; Neal Livesay; Jan Meißner; Soham Mukherjee; Alexander Nikitin; Theodore Papamarkou; Jaro Prilepok; Karthikeyan Natesan Ramamurthy; Paul Rosen; Aldo Guzman-Saenz; Alessandro Salatiello; Shreyas N. Samaga; Simone Scardapane; Michael T. Schaub; Luca Scofano; Indro Spinelli; Lev Telyatnikov; Quang Truong; Robin Walters; Maosheng Yang; Olga Zaghen; Ghada Zamzmi; Ali Zia; Nina Miolane
  Title TopoX: A Suite of Python Packages for Machine Learning on Topological Domains Type Miscellaneous
  Year 2024 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order cells; TopoEmbedX provides methods to embed topological domains into vector spaces, akin to popular graph-based embedding algorithms such as node2vec; TopoModelX is built on top of PyTorch and offers a comprehensive toolbox of higher-order message passing functions for neural networks on topological domains. The extensively documented and unit-tested source code of TopoX is available under MIT license at this https URL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ HPF2024 Serial 4021  
Permanent link to this record
 

 
Author German Barquero; Sergio Escalera; Cristina Palmero
  Title Seamless Human Motion Composition with Blended Positional Encodings Type Miscellaneous
  Year 2024 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Conditional human motion generation is an important topic with many applications in virtual reality, gaming, and robotics. While prior works have focused on generating motion guided by text, music, or scenes, these typically result in isolated motions confined to short durations. Instead, we address the generation of long, continuous sequences guided by a series of varying textual descriptions. In this context, we introduce FlowMDM, the first diffusion-based model that generates seamless Human Motion Compositions (HMC) without any postprocessing or redundant denoising steps. For this, we introduce the Blended Positional Encodings, a technique that leverages both absolute and relative positional encodings in the denoising chain. More specifically, global motion coherence is recovered at the absolute stage, whereas smooth and realistic transitions are built at the relative stage. As a result, we achieve state-of-the-art results in terms of accuracy, realism, and smoothness on the Babel and HumanML3D datasets. FlowMDM excels when trained with only a single description per motion sequence thanks to its Pose-Centric Cross-ATtention, which makes it robust against varying text descriptions at inference time. Finally, to address the limitations of existing HMC metrics, we propose two new metrics: the Peak Jerk and the Area Under the Jerk, to detect abrupt transitions.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ BEP2024 Serial 4022  
Permanent link to this record
 

 
Author Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal
  Title GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation Type Miscellaneous
  Year 2024 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements. Large and complex models, while achieving high accuracy, can be computationally expensive and memory-intensive, making them impractical for deployment on resource constrained devices. Knowledge distillation allows us to create small and more efficient models that retain much of the performance of their larger counterparts. Here we present a graph-based knowledge distillation framework to correctly identify and localize the document objects in a document image. Here, we design a structured graph with nodes containing proposal-level features and edges representing the relationship between the different proposal regions. Also, to reduce text bias an adaptive node sampling strategy is designed to prune the weight distribution and put more weightage on non-text nodes. We encode the complete graph as a knowledge representation and transfer it from the teacher to the student through the proposed distillation loss by effectively capturing both local and global information concurrently. Extensive experimentation on competitive benchmarks demonstrates that the proposed framework outperforms the current state-of-the-art approaches. The code will be available at: this https URL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ BBL2024b Serial 4023  
Permanent link to this record
 

 
Author Petia Radeva; Jordi Vitria; Fernando Vilariño; Panagiota Spyridonos; Fernando Azpiroz; Juan Malagelada; Fosca de Iorio; Anna Accarino
  Title Cascade analysis for intestinal contraction detection Type Patent
  Year 2009 Publication US 2009/0284589 A1 Abbreviated Journal USPO  
  Volume Issue Pages 1-25  
  Keywords  
  Abstract A method and system for cascade analysis for intestinal contraction detection is provided by extracting from image frames captured in-vivo. The method and system also relate to the detection of turbid liquids in intestinal tracts, to automatic detection of video image frames taken in the gastrointestinal tract including a field of view obstructed by turbid media, and more particularly, to extraction of image data obstructed by turbid media.  
  Address  
  Corporate Author US Patent Office Thesis  
  Publisher US Patent Office Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; OR; MV;SIAI Approved no  
  Call Number IAM @ iam @ RVV2009 Serial 1700  
Permanent link to this record
 

 
Author Panagiota Spyridonos; Fernando Vilariño; Jordi Vitria; Petia Radeva; Fernando Azpiroz; Juan Malagelada
  Title Device, system and method for automatic detection of contractile activity in an image frame Type Patent
  Year 2011 Publication US 2011/0044515 A1 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A device, system and method for automatic detection of contractile activity of a body lumen in an image frame is provided, wherein image frames during contractile activity are captured and/or image frames including contractile activity are automatically detected, such as through pattern recognition and/or feature extraction to trace image frames including contractions, e.g., with wrinkle patterns. A manual procedure of annotation of contractions, e.g. tonic contractions in capsule endoscopy, may consist of the visualization of the whole video by a specialist, and the labeling of the contraction frames. Embodiments of the present invention may be suitable for implementation in an in vivo imaging system.  
  Address Pearl Cohen Zedek Latzer, LLP, 1500 Broadway 12th Floor, New York (NY) 10036 (US)  
  Corporate Author US Patent Office Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MV;OR;MILAB;SIAI Approved no  
  Call Number IAM @ iam @ SVV2011 Serial 1701  
Permanent link to this record
 

 
Author Fernando Vilariño; Panagiota Spyridonos; Petia Radeva; Jordi Vitria; Fernando Azpiroz; Juan Malagelada
  Title Method for automatic classification of in vivo images Type Patent
  Year 2010 Publication US 2010/0046816 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A method for automatically detecting a post-duodenal boundary in an image stream of the gastrointestinal (GI) tract. The image stream is sampled to obtain a reduced set of images for processing. The reduced set of images is filtered to remove non-valid frames or non-valid portions of frames, thereby generating a filtered set of valid images. A polar representation of the valid images is generated. Textural features of the polar representation are processed to detect the post-duodenal boundary of the GI tract.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes MV;OR;MILAB;SIAI Approved no  
  Call Number IAM @ iam @ VSR2010 Serial 1702  
Permanent link to this record
 

 
Author Gerard Lacey; Fernando Vilariño
  Title Endoscopy system with motion sensors Type Patent
  Year 2011 Publication US 2011/0032347 A1 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract An endoscopy system (1) comprises an endoscope (2) with a camera (3) at its tip. The endoscope extends through an endoscope guide (4) for guiding movement of the endoscope and for measurement of its movement as it enters the body. The guide (4) comprises a generally conical body (5) having a through passage (105) through which the endoscope (2) extends. A motion sensor comprises an optical transmitter (7) and a detector (8) mounted alongside the passage (105) to measure the insertion-withdrawal linear motion and also rotation of the endoscope by the endoscopist's hand. The system (1) also comprises a flexure controller (10) having wheels operated by the endoscopist. The camera (3), the motion sensor (7/8), and the flexure controller (10) are all connected to a processor (11) which feeds a display.  
  Address Jacobson Holman PLLC; 400 Seventh Street, N.W. Suite 600; Washington, DC 20004  
  Corporate Author USPTO Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes MV;SIAI Approved no  
  Call Number IAM @ iam @ LaV2011 Serial 1703  
Permanent link to this record
 

 
Author Fernando Vilariño; Panagiota Spyridonos; Petia Radeva; Jordi Vitria; Fernando Azpiroz; Juan Malagelada
  Title Device, system and method for measurement and analysis of contractile activity Type Patent
  Year 2009 Publication US 2009/0202117 A1 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A method and system for determining intestinal dysfunction condition are provided by classifying and analyzing image frames captured in-vivo. The method and system also relate to the detection of contractile activity in intestinal tracts, to automatic detection of video image frames taken in the gastrointestinal tract including contractile activity, and more particularly to measurement and analysis of contractile activity of the GI tract based on image intensity of in vivo image data.  
  Address Pearl Cohen Zedek Latzer  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes MV;OR;MILAB;SIAI Approved no  
  Call Number IAM @ iam @ VSR2009 Serial 1704  
Permanent link to this record
 

 
Author Michal Drozdzal; Petia Radeva; Santiago Segui; Laura Igual; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria
  Title System and Method for Improving a Discriminative Model Type Patent
  Year 2012 Publication US 61/450,886 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Given Imaging  
  Corporate Author US Patent Office Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; OR;MV Approved no  
  Call Number Admin @ si @ DRS2012a Serial 1896  
Permanent link to this record
 

 
Author Francesc Tanarro Marquez; Pau Gratacos Marti; F. Javier Sanchez; Joan Ramon Jimenez Minguell; Coen Antens; Enric Sala i Esteva
  Title A device for monitoring condition of a railway supply Type Patent
  Year 2012 Publication EP 2 404 777 A1 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The device is intended to monitor the condition of a railway supply line when the supply line is in contact with a head of a pantograph of a vehicle in order to power said vehicle. The device includes a camera for monitoring parameters indicative of operating capability of said supply line. The device includes a reflective element, comprising a pattern, intended to be arranged onto the pantograph head. The camera is intended to be arranged on the vehicle (10) so as to register the pattern position regarding a vertical direction.
 
  Address  
  Corporate Author ALSTOM Transport SA Thesis  
  Publisher European Patent Office Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MV Approved no  
  Call Number IAM @ iam @ MMS2012 Serial 1854  
Permanent link to this record
 

 
Author Michal Drozdzal; Santiago Segui; Petia Radeva; Jordi Vitria; Laura Igual
  Title System and Method for Displaying Motility Events in an in Vivo Image Stream Type Patent
  Year 2011 Publication US 61/592,786 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Given Imaging  
  Corporate Author US Patent Office Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; OR;MV Approved no  
  Call Number Admin @ si @ DSR2011 Serial 1897  
Permanent link to this record