CIP-StudyGuide-2023-Domain-1.pdf
2023
The AIIM Official CIP STUDY GUIDE
DOMAIN 1: Creating, Capturing, and Sharing Information

Are you the next Certified Information Professional? This study guide contains the body of knowledge necessary to help you prepare for the Certified Information Professional Exam and to be successful as an information professional in the Intelligent Information Management era.

Introduction

We begin this chapter by reviewing how organizations create, capture, and share information, from a variety of sources, for a variety of purposes. We start with a look at multi-channel capture: all the different ways in which organizations receive information, and the impact those channels have on capturing and managing it. We look at document management and collaboration, and how they allow organizations to create and manage information more efficiently and effectively.

Once information is created, it needs to be captured somewhere. We review different types of information solutions and their capabilities and limitations. Finally, some information needs to be kept for extended periods, often longer than the lifecycle of the systems that create or store it, so we’ll take a look at digital preservation.

Collectively, creating, capturing, and sharing information form the first step in the intelligent information management lifecycle. Indeed, this step sets the stage for everything that follows: extracting intelligence from information, digitalizing information-intensive processes, and even automating governance and compliance.

And, of course, we implement information management systems in support of creating and capturing information and ensuring its usefulness to the organization. What this means is that if an organization does not put effective processes in place to create, capture, and share information, the results will often get very messy very quickly.
Often, organizations’ information stores lack the control and governance needed to reduce complexity, and this leads to information being created and stored haphazardly, with multiple copies and versions all over the organization.

© AIIM aiim.org/CIP

Selecting the Appropriate File Format for Business Applications

Common Business File Formats

There are thousands of file formats available; it often seems like every application uses its own. Different file formats work better to meet certain business requirements, and selecting the wrong format can cause issues for organizations, their customers, their legal team, etc. Here are some very common file formats used in almost every organization. We’ll look at each of these in a bit more detail in the next few sections.

Adobe Portable Document Format (PDF)

PDF has been around since 1993 as a way to share rich documents, including formatting, links, and images. Over the years the format has been updated a number of times; its proprietary nature has resulted in significant issues with backwards compatibility.

In 2005 Adobe produced a subset of PDF, PDF/Archive (PDF/A), which was standardized as ISO 19005-1. The purpose of PDF/A is to provide a stable format suitable for archiving, in part by prohibiting features that can change the look of the document, such as active code or font linking. The intent is to make PDF/A suitable for long-term preservation, where the goal is to faithfully reproduce a digital document in the future. This standard is regularly updated.

In 2008 Adobe went further and made the PDF specification a standard as ISO 32000-1. Adobe has also produced other specialized PDF formats, including PDF/Engineering, PDF/X for prepress digital exchange, and PDF/UA for universal accessibility.
Today PDF is a widely used format for a variety of applications because of its support for multiple pages and a variety of content types within a single PDF, and because it is natively supported in almost all web browsers through the ubiquitous PDF Reader. Many scanning and content creation applications can output directly to PDF.

Tagged Image File Format (TIFF)

Tagged Image File Format, or TIFF, is a graphical format presenting the document as a digital copy of the original using raster images. It is an ISO standard. It supports many different compression formats, particularly lossless ones; that is, all of the original data remains present in the file. This can result in significantly larger file sizes compared to other approaches.

TIFF was the most common format found in digital imaging applications because it was the first to be based on industry-wide standards. It was the default file format for many scanners and digital imaging applications for a number of years. It supports black and white (bitonal), grayscale, and color scanning and can create multi-page files as well.

TIFF is not as popular today compared to PDF for scanning office documents for two main reasons. First, TIFF files are not searchable without taking additional steps to perform character recognition on them. This is often built into the PDF capture process. Second, the ubiquity of the PDF Reader means that PDFs are viewable on almost any device, including mobile devices. TIFF readers and plugins are much less common.

Joint Photographic Experts Group (JPEG)

The Joint Photographic Experts Group, or JPEG, is both a standard compression algorithm and the file format that uses that algorithm. JPEG works by discarding up to 99% of color information that can’t be discerned by the eye.
This works best for continuous tone images such as digital photographs; it does not work as well for black and white images such as scanned business documents. JPEG is an ISO standard format.

JPEG is considered a “lossy” algorithm since data is actually discarded during the compression process. This is generally not an issue when creating a digital image but can become a problem if the image is repeatedly converted. While JPEG can support some methods for displaying multiple pages in a single file, these are not very well supported in the marketplace.

JPEG has become much more common as mobile scanning and capture applications have matured; it is often the default file format for those devices and applications.

Portable Network Graphics (PNG)

Portable Network Graphics format, or PNG, is a more recent graphics format that supports very efficient, lossless compression and up to 32-bit color, making it very desirable for web graphics. Many graphics programs, including those in digital photography applications, support the creation of PNG files; as with JPEG, PNG is natively supported in almost all current web browsers. PNG was originally published as a W3C standard in 1996 and as an ISO standard in 2004.

Microsoft Office (Word, Excel, PowerPoint)

Finally, we take a look at Microsoft Office. While there are other office productivity suites out there, including very good open source offerings, none of them has achieved the market share that Microsoft Office has; in fact, in some ways Office defines that market space. Office includes a number of tools; the composition of these tools changes over time, as do the individual tool capabilities. For this course we will limit our review to the three most common formats: Word, Excel, and PowerPoint.

Word is commonly used for creating and collaborating on business documents such as reports and contracts. Excel is a spreadsheet that can be used for financial calculations as well as presenting information in tabular format.
PowerPoint is used to summarize and present information succinctly. Each tool offers a broad spectrum of capabilities that can make office workers more productive, but these broad capabilities also result in significant complexity in terms of what can be included in a given file or document.

Office file formats are all considered proprietary; there are standards-based XML versions of each file format, but they are much less commonly used than the default formats. Microsoft’s sheer domination of the market ensures some compatibility between versions and among other tools, but complex authoring can result in incompatibilities in the future and in accessing significantly older versions of files.

Selecting the Right File Format

So how do you know which file format to use? It really depends on the business needs of the department creating the information and, ultimately, the needs of the organization. If you want to make information available online, you should select a format that can be displayed in a variety of browsers, isn’t too large or cumbersome, etc. If you’re looking for an archival format, that is likely a different format, and the choice will depend on the original. If your customers use Office, it will be difficult to engage them using an incompatible office productivity suite or format.

Standard file formats are preferred where possible, especially where you need to exchange information with others or when you need to retain information for long periods of time. As we’ll see in another module, the more complex a file format is, the harder it is to use over time; proprietary formats start out with more complexity.

Proprietary File Formats

Introduction

In the market today there are many different file formats available; in fact, it seems like almost every application has its own file formats. One of the ways to look at these is based on whether, or how much, they are based on standards.
Standards are generally good to follow because they represent consensus on a particular set of capabilities, behaviors, etc. On the other hand, proprietary file formats are specific to a particular application and sometimes even to a particular version of an application.

Proprietary File Formats

Most organizations don’t give much thought to the file formats used to store their information, and this can cause problems in the short and long term when those formats are very complex and proprietary. There are two ways to look at proprietary file formats depending on their market share.

For formats with significant market share, such as Adobe PDF, there are often other applications that claim to create compatible files. Whether the files are in fact compatible often depends on what is meant by “compatible”: for a simple document this may be easily done, but for a highly complex document it is likely that there will be issues.

On the other hand, for niche formats, there may be no other way to access those documents than through the provider’s application or service. If those shut down or change significantly, there is a real possibility of losing access to that information, and the longer it’s retained, the more likely this will be the case.

At the same time, however, it’s important to consider that converting a file from one format to another, especially from a highly complex proprietary format to a more open but perhaps less functional one, presents some issues as well, especially with regard to authenticity and trustworthiness over time. The CIP needs to balance these two issues carefully, both in the short term and over time. We discuss long-term access to information in another module.
Whenever possible, CIPs should recommend standards-based file formats as they will tend to be more accessible over time. If this is not possible, selecting a proprietary file format with a larger market share will help to preserve access to that information in the short term.

Introduction to Capture

Identifying Process Entry Points for Information

For any information-centric process, that information has to enter the process from somewhere. Where it comes from can make a difference because it can lead to certain assumptions about the information: its format, its quality, its state of approval, and so on. And for some types of processes, the fact that a piece of information has entered the organization may start certain workflows or responsiveness requirements.

Process entry points include:
- Email.
- Paper mail/documents.
- Fax.
- Voice / voicemail.
- Websites and web-based forms.
- Internal and external workflows.
- Smartphones, tablets, and mobile apps.
- Uploads to a portal or file sharing solution.
- And every other channel used to create or transmit information.

That said, we can start with the basics: internal and external.

Internal. Internal information can be created specifically for/as part of a process, or it can enter a process as the output of another process.
- Example 1: An employee creates an expense report, scans and uploads receipts to that report, and submits it to the expense report approval process.
- Example 2: The finance department aggregates all of the approved expense reports for the month, separates the expenses by charge code, and charges them to the appropriate departments on the monthly financial statement.

External. External information has to be submitted to the organization, and then to a particular process, by someone.
This could be an automated process; for example, a customer fills out a loan or insurance application online. There may be supporting documentation required of the customer, depending on the particular process and transaction. Once the initial application requirements are complete, the application is routed automatically to the applicable process for further processing.

Point of Service

One of the key considerations here, and for multichannel capture more broadly, is that, to paraphrase AIIM research, we want to “digitize everything that moves” and do so as early as possible. Every step that involves paper is less efficient than it should be. So ideally this capture and digitization happens at the point of service or the point of the transaction. This is not a production imaging process, but rather a very decentralized one that leverages standalone desktop scanners, multifunction devices, or mobile devices and applications.

Here’s a very common example: It used to be that in order to deposit a check, a customer would have to go to a bank and talk to a human being. Even ATM deposits required the creation of a deposit slip, placing the check to be deposited in an envelope with that slip, and waiting several days for it to be processed. Today many ATMs will scan the front and back of the check as it’s fed into the machine, and that money can be made available much more quickly. And there are mobile banking apps that will allow end users to take their own pictures of checks and use those to make a deposit directly through the app.

Key Considerations

Regardless of the specific process entry point, but especially as applied to point of service capture processes, there are some things to keep in mind to ensure the information received can be processed effectively: completeness; format and quality; and chain of custody. There needs to be some sort of process step to ensure the completeness of the submission or transmission.
That step also needs to account for the expected formats to be received and the overall quality of those documents, especially scanned images. And these lead to a need for documentation of how the information was received and what happened to it, that is, a chain of custody. Depending on the process and the organization this may be more or less formalized, but there should still be some way to track where a particular piece of information is and to confirm that it has been received.

Determining the Best Points of Capture for Different Kinds of Information

Capture is the process of getting information from its source into some type of more formal information management environment, system, application, or business process, and then recording its existence in the system. This includes scanning or otherwise converting a paper document into a digital form.

To record the existence of the information in the information management system, you need to include relevant data about the paper document, other physical object, or digital file that has been captured. This “data about data” is known as metadata; we will discuss metadata in greater detail elsewhere in this course.

Sources of Content

For the CIP, the focus is on enterprise information. Enterprise information can take any number of forms. We may think of typical office documents such as spreadsheets or contracts, but any content with business value can be worth managing in an information management system. Think beyond the office suite to email, messaging or chat systems, text messages, or engineering drawings. The popularity of collaboration platforms and social media tools means more business communication is occurring in those platforms. Rich media types such as video, audio, or digital photographs are also becoming increasingly common.
They are useful ways for organizations to share information and communicate with employees, partners, citizens, and customers.

And we can’t forget the legacy world of paper. Many organizations still use paper-based forms, handwritten signatures, and other physical media to represent business transactions. We must still recognize and manage this content or plan for its digitization where appropriate.

Information can be in any digital file format (data, text, images, audio, video, and others), and even non-electronic content can be managed. It is the content of that information, not the format, that is significant when thinking about information management.

We know that enterprise content can come from a wide range of authoring tools. Modern organizations are also realizing that the ways in which content is created, retrieved, and read have changed. The work world is becoming increasingly mobile, and information management professionals need to look at the kinds of devices and applications their information workers use. As desktop PCs have given way to laptops and notebooks, so may these devices give way to tablets and increasingly sophisticated smartphones.

As new mobile and portable devices improve usability and gain enterprise acceptance, information management professionals need to plan for the short-term future and understand how and where enterprise content is originating. The requirements and preferences of internal and external constituents for creation, access, and management of enterprise information should be considered.

Expensive high-volume scanning hardware is also giving way to smaller departmental tools and better-quality, inexpensive multi-function devices. Sophisticated smartphones have camera and snapshot capabilities that are beginning to see adoption by financial services companies for use cases such as check deposit. An increasing number of applications that use smartphones to capture documents are becoming available.
Nor is paper disappearing from the mail rooms and file rooms of large companies in transaction-based businesses such as insurance, banking, or the public sector. The use of paper for transmitting information is declining, however, which leads some organizations to outsource the digitization of paper documents to service providers.

And content continues to be created in line of business applications such as Customer Relationship Management (CRM) or Human Resources tools. This content is ripe for integration, import, and interoperability as your information management system deployment progresses.

The Point of Capturing Information

The entire point of capturing information is to establish some sort of control and context around it and to ingest it for automated processing in transactional processes. The most effective way to do this is through formally capturing and storing information in some sort of repository. We address repositories in more detail elsewhere in this course, but in effect a repository is a place, often a database, where information can be stored, retrieved, accessed, and ultimately managed through its entire lifecycle.

Capturing information into a repository allows the organization to establish control over that information. Security and access control lists can be set as to who can retrieve, manipulate, or even delete it.

It also allows the organization to establish context around that information. What process is it part of? Who processed it and in what fashion? When was it captured? What other documents, records, or processes does it relate to? Who owns it and who uses it?

Capture at the Point of Service

We mentioned in another module that the best time to capture information is often at the initial point of service or transaction, because every step that takes place at the speed of paper is less efficient than it should be.
This will help to reduce transaction times and the overall costs of the process, making it more efficient for the organization as well as for any customers or partners involved.

If information can’t be captured directly at the point of service, it should still be digitized (if applicable) and captured as early in the process as possible. This should be the general rule regardless of the type or source of information. That said, there are some more specific approaches to consider.

Point of Capture: Business Documents

For business documents such as Microsoft Office files or Adobe Acrobat PDFs, there are a few options for when to capture them:
- At the point of service or transaction. This would especially apply to documents received from elsewhere.
- Automatically as part of a workflow, such as a review and approval workflow. In this case, once the business rules have been satisfied, the final deliverable would be captured into a repository according to those rules. This would also apply to rich media types of business documents such as videos, photographs, or other deliverables with complex authoring and approval processes.
- As part of the authoring process. Instead of creating a document on the desktop, consider changing the process so that documents are created within a document or content management solution. As soon as the document is created it’s saved, versioning can be applied, etc., and it can be managed more effectively throughout the authoring process as well.

Point of Capture: Scanned Images

Business documents that come into the organization in paper format need to be digitized as soon as possible. This means that in a transactional environment, paper documents should be digitized as they are received and then uploaded to the repository of record.
In a production imaging environment where hundreds or thousands of pages are being scanned every day, the imaging process should include a step for releasing the images to another process, to a repository, or even to a file share location.

Point of Capture: Email

Email has been a challenging information type for decades for many reasons:
- Very high volume for most users.
- Very high volume of junk to sift through (solicitations, spam, spyware, Reply-All responses, etc.).
- Granularity and terseness of some messages.
- Verbosity of other messages (multiple topics, multiple pages, all the replies in the thread).
- Informality: so many email messages contain things you would never say to someone’s face.
- And of course, attachments and attachment spam.

For all these reasons, but especially the first two, the best approach to capture here is often to archive all email sent or received, or at least all email sent to certain roles or functions, at the email server as it is being sent and received. This is a very broad-based approach, but as we’ll see in a more in-depth discussion of email management later, manual approaches to email simply do not work.

Point of Capture: Structured Data

In most organizations, structured data is “captured” into its own application. Every line of business application has its own data tables, and perhaps its own database, and information is captured as it is entered into the system.

In some cases, this structured data is used to generate reports, which may in turn need to be captured and managed. These may need to be captured manually or through a workflow of some kind to ensure all the data is present.

Structured data can also be captured more transactionally through the use of forms, whether paper-based or digital. In the case of paper-based forms, the most efficient way to capture this data accurately is through scanning the forms and then using recognition technologies to automatically extract and capture that data.
Digital forms may have their data captured directly into a structured line of business application.

Introduction to Document Management

So what is document management? It is the use of a software application to track digital documents from creation through approval and publication. It serves in many ways to apply a formal governance framework to the document creation and collaborative editing processes.

Today document management is generally incorporated as a set of capabilities in a broader information management solution. We will address those broader solutions in another module in this course. Document management capabilities are also key parts of the document control discipline, which is beyond the scope of this course.

Traditional document management includes the following capabilities:
- Check-in/check-out
- Version control
- Roll back
- Security and access controls
- Audit trails

Next we’ll look at each of these in a bit more detail.

Check-in/Check-out

Check-in and check-out are very similar to how a library works: when a book is checked out, nobody else has access to it until it is checked back in. In document management, a user can check out a document in order to make changes to it. While the document is checked out, nobody else can edit it, and, depending on the solution, it may not even be accessible in a read-only mode.

Once the user has made any desired changes, the user checks the document back in. At this point a new version of the document is created (we’ll discuss that shortly), and the document is unlocked and available for other users to review and/or check out.

The point of this capability is to ensure that multiple users aren’t editing the same document simultaneously and overwriting each other’s changes.

Version Control

As the name suggests, version control is used to manage or control different versions of a document as it goes through the authoring and approval process.
New versions are automatically created through auto-save, by saving the document manually, or, in this specific context, by checking the document back in.

Some systems support major and minor versions of documents, while others simply consider any changes to result in a new version of the document. In either case, the system will also allow authorized users to compare different versions of the document to see what changes were made between any two versions.

This feature also reduces the need to store multiple copies and versions, and their associated naming conventions, in order to retain a document’s history. The manual approach (changing file names to, e.g., “mydocument_final.doc”) is often overlooked or adds little value, with the result that even with those naming conventions nobody is certain which version is the current, final, or approved one. Version control results in storing one document, with all of its versions, in one location so there is no confusion.

Roll Back

Many systems that offer document control capabilities offer the ability to roll back or revert to the previous version. This is often done when a version is released prematurely or with some sort of error. This is more commonly seen in web content management and software development repositories than in document repositories, but it is seen in some case management and contract management solutions as well.

Security and Access Controls

Security and access controls help to ensure that any changes made to a document are made only by authorized users. Some users might be able to make changes directly to a document, while others might be limited to commenting on the document and still others to read-only access. These controls also help to provide accountability, as any changes made are linked to the individual(s) who made them.

Audit Trails

Audit trails show what has happened in a system.
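As a rough illustration of how the capabilities listed earlier in this section (check-in/check-out, version control, roll back, access controls, and audit trails) fit together, here is a minimal in-memory sketch in Python. It is not how any particular document management product works; the `ManagedDocument` class, its fields, and the user names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ManagedDocument:
    """Hypothetical document with a lock, a version history, and an audit trail."""
    name: str
    editors: set[str]                                   # users allowed to change the document
    versions: list[str] = field(default_factory=list)   # index = version number
    checked_out_by: Optional[str] = None                # None means the document is unlocked
    audit_trail: list[tuple[str, str, str]] = field(default_factory=list)

    def _log(self, user: str, action: str) -> None:
        # Audit trail entry: (UTC timestamp, who, what).
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append((stamp, user, action))

    def _require_editor(self, user: str) -> None:
        # Access control: only authorized users may change the document.
        if user not in self.editors:
            raise PermissionError(f"{user} may not edit {self.name}")

    def check_out(self, user: str) -> str:
        self._require_editor(user)
        if self.checked_out_by is not None:
            raise RuntimeError(f"{self.name} is checked out by {self.checked_out_by}")
        self.checked_out_by = user
        self._log(user, "check-out")
        return self.versions[-1] if self.versions else ""

    def check_in(self, user: str, content: str) -> int:
        if self.checked_out_by != user:
            raise RuntimeError(f"{user} does not hold the check-out")
        self.versions.append(content)       # checking in creates a new version
        self.checked_out_by = None          # and unlocks the document
        number = len(self.versions) - 1
        self._log(user, f"check-in v{number}")
        return number

    def roll_back(self, user: str) -> int:
        """Revert by saving the previous version again as the newest version."""
        self._require_editor(user)
        if self.checked_out_by is not None:
            raise RuntimeError(f"{self.name} is checked out by {self.checked_out_by}")
        if len(self.versions) < 2:
            raise RuntimeError("no previous version to roll back to")
        previous = len(self.versions) - 2
        self.versions.append(self.versions[previous])
        number = len(self.versions) - 1
        self._log(user, f"roll-back to v{previous} as v{number}")
        return number
```

In this sketch a roll back adds a new version rather than deleting the faulty one, so the history and audit trail stay complete; that is a design choice made for the example, and real systems may behave differently.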
In the context of document management, audit trails can track every change to a document throughout its lifecycle, including who made what changes, when, and in what sequence. As with security and access controls, this helps to ensure accountability and transparency in the authoring and approval process.

Systems of Record

Let’s start by defining a system of record. Wikipedia defines it as “an information management system that is the authoritative source for a given data element or piece of information.” This is not a recordkeeping system per se, although it can be. Rather, this is the place where a particular type of information is stored.

Ultimately, the organization should have a system of record identified for every type of business information it creates, receives, and manages, i.e., “a place for everything and everything in its place.”

Where to Store Information?

So which system is right for you or your organization? There is no single right answer because each organization, and each department or process within the organization, has different business needs. Records need to be stored more formally than drafts. Rich media has unique requirements for storage and retrieval compared to scanned documents. Personnel files are more sensitive compared to other types of information.

However, there is a sort-of right answer, which is that for any specific type of information, there should be a place designated for its storage. It doesn’t necessarily matter what that system is. Rather, it matters that users understand that there is an answer to their question, “Where do I store my files/documents/stuff?”

These answers in turn should be part of the governance framework, most likely in process and procedural documentation, and users should be trained and checked on regularly to ensure they are storing information in the appropriate location. And that location should be a repository of some sort.
Any repository is likely to be more fit for purpose than networked file shares, which in turn will always be better than storing information on users’ individual computers, mobile devices, flash drives, and so on.

Similarly, legal, risk management, compliance, records management, IT, and so forth want to understand where and how information is stored so they can perform the tasks they need to do in support of the organization as well.

The Capture Process

There are two distinct capture processes we will look at: paper and born-digital. There are some similarities, and once paper has been scanned it would generally follow the born-digital capture process, but there are some unique tasks to review in the paper capture process.

The Paper Capture Process

So, let’s begin with the process for capturing, or digitizing, paper documents. The digitization process includes several sub-processes.

Documenting these processes, and the organization’s adherence to them, is the key to ensuring that the scanned images meet the organization’s legal, evidentiary, and business requirements.

The Born-Digital Capture Process

Now, let’s look at the process for capturing born-digital information.

You will need to make a reasonably quick assessment of exactly how digital documents and records are created, received, and used, and where the key risks are in terms of information that needs to be captured into the information management repository, or into a business process and then into a repository.

Once you have identified your potential risk areas, you will need to examine the organization and its work processes to determine exactly what types of information need to be captured. By doing this, you will also gain a much better idea of the business transactions that take place in the organization and the processes for creating and using information.
Next, you need to determine when in the process to capture a particular type of information. It is best to capture information at the earliest point of entry into an organization to allow for automated digital processing. After all, capturing drafts and capturing final documents or records would occur at different points in the creation and publishing process. Next, you will need to determine how to capture information effectively. Where possible this should be an automated process. Once decisions have been made about what to capture and how to do it, those decisions should be defined in procedures that can be used with appropriate tools to capture specific digital records. You will also need to address longer-term sustainability issues covering the capture of digital records to ensure access to them over time.

The Capture Format
The decision to capture a particular type of information is only a part of the process. The organization will also need to determine in what format that information should be captured. There are several options:
- Original native format. This is the preferred approach because it is the most faithful representation of the transaction. In case of legal challenge this is also the most likely to be requested by counsel or auditors.
- Commonly used format. For example, an organization may have a number of different versions of Microsoft Office or Adobe Acrobat in place. Some documents might be created in the most current version but then saved to a final format that is an earlier version.
- Standards-based format. This approach requires users to save the record into a non-proprietary format such as plain text, HTML, or PDF/A. This is better for ensuring long-term access to the information but may come at the expense of losing certain functionality. This can be a significant issue for complex or proprietary file formats.
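To make the idea of defining capture decisions in procedures more concrete, here is a minimal sketch in Python of how the what, when, how, and format decisions might be written down in a form a tool could read. The names `CaptureRule`, `CaptureFormat`, `RULES`, and `rule_for`, and the two example rules, are hypothetical and chosen only for illustration; they do not come from any particular product or standard.

```python
from dataclasses import dataclass
from enum import Enum


class CaptureFormat(Enum):
    """The three capture format options described above."""
    NATIVE = "original native format"
    COMMON = "commonly used format"
    STANDARDS_BASED = "standards-based format"


@dataclass(frozen=True)
class CaptureRule:
    information_type: str          # what to capture
    capture_point: str             # when in the process to capture it
    automated: bool                # how: automated where possible
    capture_format: CaptureFormat  # in what format to capture it


# Hypothetical example rules; a real organization would define its own.
RULES = {
    "invoice": CaptureRule("invoice", "on receipt", True, CaptureFormat.NATIVE),
    "contract": CaptureRule("contract", "on final approval", True,
                            CaptureFormat.STANDARDS_BASED),
}


def rule_for(information_type: str) -> CaptureRule:
    """Return the documented capture rule, or fail loudly if none exists."""
    try:
        return RULES[information_type]
    except KeyError:
        raise LookupError(
            f"No capture rule defined for {information_type!r}"
        ) from None
```

Failing loudly when no rule exists is a design choice in this sketch: it surfaces information types for which the organization has not yet made a capture decision, rather than silently applying a default.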
Why Manual Capture Doesn’t Work
Many organizations operate in an environment in which users are encouraged, expected, or required to identify and capture their own information related to their business process. There is some value to this approach – users are most knowledgeable about their business processes and activities and should be the best-positioned to determine what is important and where to store it. But the reality is something different. In a majority of organizations, the majority of users do not identify, capture, and manage their information properly. They simply don’t. There are a number of reasons for this. First, every organization is doing more with less. Where users already feel overwhelmed by their workloads, it is very difficult to expect them to periodically stop what they are doing to do this “information management” thing. And this is probably more true the more senior the user in question – which is even more of an issue because senior users are more likely to be creating important documents and records that document decisions, set strategy, etc. No matter how much they are trained, most users simply don’t see information management as a priority – or certainly not higher than the actual work they are doing. And many of them don’t get training. Or they get a 30-minute training session when they are onboarded and maybe a 30-minute refresher session every year or two. More importantly, users are very likely to classify or file things inconsistently over time, and the more complex the classification structure, the more likely this is to be the case. If you don’t think this is an issue, take a look at your own computer, your own folder structures on your desktop or where you store your documents, and especially your own email inbox. Have you been consistent in the way you save and classify messages over time?
And there is always the possibility of an error: the user drags the record into the wrong folder, or makes a typographical error in a key metadata field, etc. So, for all these reasons, it is vital that organizations streamline and automate the capture process to the maximum extent possible. Now let’s look at some ways we can do this for different types and sources of information.

Automating Capture
There are a number of ways that the capture of born-digital information can be automated.
- Bulk import. This most often occurs when migrating information from one location or system to another.
- Workflow. One of the steps in a workflow could be to take action on a particular document. For example, once the final document has been approved, it can be converted to PDF, the PDF posted to the website, and the original finalized document declared a record and moved to the records repository.
- Content types, or document types, or record types, or whatever term a given repository uses. This approach provides for the definition of characteristics that define a certain information type such as invoices, contracts, etc. Every document that meets that definition is automatically processed in a certain way.
- Analytics. This approach leverages machine learning to understand the contents of documents or records and process them according to that understanding.

Approvals
The capture process may also need to address approvals. When paper documents are scanned, the final step in the quality control process is approval that the images were captured correctly. This is generally a less formal, more tacit approval process. When more complex documents are being captured, such as contracts with references, or engineering documents with external reference files, it’s often valuable to have a more formalized approval process. This could be done through a review and approval-type workflow, or through a more manual process.
Either way, it’s important to ensure that all parts of the document are complete and accurate, and that any supporting documentation that is required has been captured as well.

Auditing the Capture Process
No matter what approach or approaches you take to capturing information, it is important to audit the capture process periodically to make sure that information is being captured accurately and consistently. This audit needs to include the documents or records themselves, but it should also include metadata. After all, while capturing information can help safeguard it, it’s not very useful if it cannot be located or retrieved.

Capture Process Metrics
Here are some capture process metrics that can be collected as part of the audit process. These metrics may help make the business case for effective information management.
- The percentage of information captured into a particular repository – documents, records, and metadata.
- Growth, in terms of both total volume and the rate of growth. This will help the organization plan for future storage needs.
- Access and retrieval rates – how much information is actually accessed and used.
- The percentage of information that has been captured but no longer has business value – or its value is unknown.

Requirements for Multichannel Capture

Multichannel Capture – Definition
Simply put, multichannel capture is capture of different types of information from a variety of sources. These range from traditional production or desktop scanners to multifunction devices, to mobile devices and applications, and everything in between. And it’s not just the capture of paper through a scanning process: multichannel capture may also include information received via web forms or website uploads; email attachments; and even fax and structured print streams. Ideally multichannel capture can work in every channel through which the organization receives information.
Ultimately the value of a document is in its content, not whether it was received as an email attachment, captured via a smartphone or tablet, or scanned using a multifunction device.

Types of Multichannel Capture
Multichannel capture also describes the point and approach to capture. It certainly includes traditional production capture but extends beyond that to include ad hoc capture at the point of service or of a transaction. And it includes on-demand capture, often using a multifunction device or mobile device.

Classification and Routing
Once the content enters the organization, it will go through the multichannel hub. Here, the content is identified, classified, and routed to the appropriate work process. Different platforms take different approaches to this. Some solutions apply specific templates, rules, and analysis per channel; others attempt to apply a uniform set of rules and analysis regardless of the format or source of the content. Classification and routing can be done through static rules and template-based recognition. The less advanced approach is based on OCR (keywords), rules, and regular expressions. The more advanced approach is used in intelligent document processing (IDP), and includes technology like machine learning and AI, natural language processing, and semantic analysis. Solutions need to be trained on specific document types, which may require large training sets that are difficult to collect. The native AI platforms, and in particular the latest AI tools, require significantly smaller training sets. Once the content is identified and classified, it can be sent automatically to the appropriate workflow if one exists. And a key piece of the capture process is assigning metadata to the document. This can be automated as well using many of the same techniques such as data extraction, i.e., identifying the relevant data fields and providing them to the respective business process.
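As an illustration of the less advanced, rule-based approach described above, the following Python sketch classifies already-recognized text with regular expressions, routes it to a workflow, and extracts one metadata field. Everything in it is an assumption made for illustration: the document types, the patterns, the workflow names, and the fallback to a manual review queue are hypothetical, and a real capture platform would use its own rules and recognition engine.

```python
import re

# Hypothetical rules: (document type, identifying pattern, destination workflow).
RULES = [
    ("invoice",
     re.compile(r"\binvoice\s+(?:no\.?|number)\b", re.IGNORECASE),
     "accounts-payable"),
    ("contract",
     re.compile(r"\bthis\s+agreement\b", re.IGNORECASE),
     "legal-review"),
]

# Hypothetical data-extraction pattern for a single metadata field.
INVOICE_NUMBER = re.compile(
    r"\binvoice\s+(?:no\.?|number)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
)


def classify_and_route(text: str) -> dict:
    """Classify recognized text, pick a workflow, and extract metadata."""
    for doc_type, pattern, workflow in RULES:
        if pattern.search(text):
            metadata = {}
            if doc_type == "invoice":
                match = INVOICE_NUMBER.search(text)
                if match:
                    metadata["invoice_number"] = match.group(1)
            return {"type": doc_type, "workflow": workflow, "metadata": metadata}
    # Assumption: content that matches no rule goes to a person for review.
    return {"type": "unknown", "workflow": "manual-review", "metadata": {}}
```

For example, `classify_and_route("Invoice No. INV-1001 Total due: $500")` would be classified as an invoice and routed to the hypothetical accounts-payable workflow. Static rules like these are simple to audit but brittle, which is one reason the more advanced approaches exist.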
Security
Security is even more important when digital information can come in from so many sources. As things come in, the organization needs to implement effective processes and protocols to ensure, for example, that personal medical information doesn’t sit on an open fax or in a shared inbox. It may be necessary to redact this information for downstream processing in addition to restricting access to the document – both in the analog and digital world. Incoming digital documents need to be scanned for viruses and malware, too, to make sure they don’t cause information security issues. And access controls need to be set up to ensure that only authorized users have access to the incoming information, and only for the purposes they need it for. For example, someone assembling a mortgage application based on the customer’s submissions probably doesn’t need to be able to edit the documents submitted.

Quality Control
Quality control is always important, but it becomes even more important as organizations incorporate more of these ad hoc capture processes, perhaps using untrained “operators” (aka customers, line of business managers, or anyone whose main job doesn’t involve scanning) and a variety of capture hardware including personal smartphones. So the first thing to look at is the quality of the actual captured content. There needs to be a step in the process, manual or automated, that ensures that the source content is captured effectively. Images can’t be blurry or out of focus. The entire document needs to be captured. And so forth. Metadata also needs to be checked. While character recognition technologies are mature and get better every year, they are still not perfect and their accuracy can vary widely depending on the source and operator. Something as simple as an off-by-one mismatch can result in every single document’s metadata being incorrect.
Security needs to be checked periodically as well to make sure only authorized users are accessing and interacting with the captured content. Security of the various inputs and channels should be checked periodically as well.

The Digitization Strategy for Paper-based Information

Digitization Strategies
There are four primary strategies for digitization: full back file conversion, partial back file conversion, day forward, and scan on demand. Next, we will define and compare each of these.

Full back file conversion. The “back file” is a term generally used to describe inactive paper documents which might be stored in a file room, records center, or even an offsite warehouse. In this approach every paper document in that back file is digitized and, often, the paper originals are destroyed. We discuss the legality of this in another module. This is the most comprehensive approach, but also the most expensive. If one of the drivers of the initiative is to reclaim office space, this may be required. In addition, some documents, particularly older ones, may be poor quality, odd-sized, etc., which will drive up the cost and resources required.

Partial back file conversion. In this approach only some paper documents are digitized. Which ones are and aren’t could be a function of age – for example, only scanning one year of the back file and leaving the rest as paper. It could also be related to a business function – for example, scanning invoices and contracts but not other types of documents. This approach is not as expensive, but could lead to users needing to search both physical and digital storage locations in order to find everything. In addition, if the primary benefit or business case is from reclaiming storage space, this approach is less beneficial.

Day forward. In this approach the organization sets a date and scans everything after that date.
Paper documents older than that date are left in their original format. This approach can be very cost-effective because only newer, probably better quality, probably current documents are scanned. The biggest drawback is that, depending on the documents required, users may still need to search in more than one location, but this is offset somewhat by the fact that users will know where things are before and after the day forward date. In addition, it is vital that the organization be able to keep up with digitizing the volume of new or incoming documents. And this doesn’t really save any current storage space – though it could help to reduce the need for additional storage space in the future.

Scan on demand. There are two ways to do this. One is to leave everything paper and only scan documents that are requested or accessed. This is the absolute cheapest approach, but it adds steps to each individual business transaction. The other is to combine day forward and scan on demand. The organization would scan everything after, say, January 1 of the current year, but would also scan older documents that are requested or accessed. This ensures that active documents do get scanned, while inactive ones do not. This is a very cost-effective approach and focuses on active documents while not worrying about inactive documents. The downside is that this provides minimal storage space savings, and as noted may increase the time to process certain transactions.

Which to Choose?
How do you choose which approach will work best? It depends first on the purpose of the digitization initiative – is it to make content available online? Is it to reclaim storage space? Is it to make it easier to use the documentation in question to support customer service, or analysis, or something else?
One differentiator is whether the documents in question are active or inactive and what the relative volumes are – it makes more sense to digitize active documents as opposed to documents that have not been touched in 10 years. It’s also important to consider the value of the information contained in those documents – which also goes to inactivity. You don’t want to spend the resources to digitize records that are within a few months of their disposition date. We can summarize these two points as determining the overall value of the documents in question, and then prioritizing those with ongoing business value over those with less or no value. A production scanning process is a lot more involved than a couple of people and a multi-function device or desktop scanner. Access to internal resources, including staff and space, needs to be considered, as does the potential cost to outsource to a scanning service bureau. Finally, documents that are in good shape are better candidates for digitization than those that are ripped, old, fragile, or otherwise in poor shape. Similarly, if you have odd-sized or odd-weight documents, such as cards, multi-part carbon or carbonless forms, oversized engineering drawings, onion skin or card stock, or anything other than standard paper documents, the digitization process will be more difficult and potentially more expensive – specialized equipment and even specially trained staff may be required.

Information Management Repositories
There are a variety of different solutions that organizations can use to manage information. These include content services solutions, point solutions, and file sharing solutions. It’s important for organizations to identify their business requirements for information management, and then select the appropriate solution based on those requirements.
Content Services Solutions
Content services solutions provide significant information management capabilities across a broad variety of business and process needs. Instead of using a variety of different point solutions at every stage in the life cycle, these solutions attempt to provide a single application to do it all. One challenge with these solutions is that they’ve been built through acquisitions, meaning that they may have different code, different look and feel, and different structures between one set of capabilities and another. Another challenge is that because they do so much, they tend to be very complex, and certainly much more complex than either point solutions or the enterprise file sync and share solutions we’ll look at shortly. Finally, comprehensive content services solutions have a lot of capabilities to bring to bear – but they are not all best of breed and in fact may be only adequate or minimally sufficient to the task at hand. Look for solutions that offer scalability and connectivity with other solutions. Core content services capabilities include:
- Document management. This includes the ability to check out documents to ensure they cannot be altered or deleted while being edited by someone else; the ability to check in documents; and the ability to track versions, either at the time the document is checked in or manually.
- Records management. This provides a mechanism to designate a subset of information as records so they can be managed more formally throughout the information lifecycle. Typical capabilities include formal retention according to content-based rules and specific disposition including destruction based on the same rules. We discuss records management in much more detail in another training course module.
- Capture/scanning. Almost all content services solutions have a mechanism to capture scanned images and digital files, apply metadata, and manage them throughout the information lifecycle.
Where they don’t or where the requirements are more complex, these solutions have the capability to ingest images and metadata from standalone imaging solutions.
- Workflow. This provides a capability to take specific actions on documents based on business rules according to the type of information, metadata, and other considerations. This is different from the more comprehensive business process management system (BPMS) approach, which we will discuss elsewhere in this course.
- Search. Here we refer to application search within the content repository as well as enterprise search with some solutions.
- Collaboration. This refers primarily to document-centric collaboration, perhaps with a very lightweight review and approval workflow. Some vendors also offer process-centric collaboration tools.

Point Solutions
Point solutions are tailored to a very specific type of content and business process. These would include, but not be limited to:
- Document imaging
- Records management
- Digital asset management
- Email management
- Engineering drawing management
We’ll look at each of these solutions in a bit more detail. Where there are standalone records management, document management, or document imaging solutions these would certainly apply as well. The issue with these solutions is that very few of them exist because of the propensity of the content services vendors to acquire them as a means of adding to their solution offerings.

Document Imaging
Document imaging is used to digitize physical documents – mostly paper, but there are some applications that can scan from microfilm as well. Paper documents are scanned in a variety of ways depending on their size, weight, condition, and other business and process considerations.
As part of the digitization process, most imaging applications also allow images to be enhanced, such as by removing lines, deskewing or straightening the images, removing holes or extraneous marks, etc. Many imaging applications allow for the provision of metadata for each image. This metadata can be entered manually by the scanner operator or indexer, or it can be recognized and extracted automatically using recognition technologies. Once this step is accomplished, the image is released to a repository, folder, or other storage location for storage or further processing. Multi-channel capture means ingesting different types of business input and expands the scope of imaging from the use of scanners to include other sources and devices, including multi-function devices, web-based portals, fax, and even smart phones and other mobile devices. We address multi-channel capture elsewhere in this course. Paper-based inputs can be captured using different input devices: production scanners, which are used in centralized scanning operations, and distributed scanners, including workgroup and personal scanners, which are used in a decentralized process. Multi-function peripherals and smart devices can be used in a decentralized process as well.

Records Management
Records management solutions are used to capture and manage types of information more formally. By capturing and declaring a particular piece of information as a record, the organization is committing to manage it throughout the lifecycle in a way that preserves its evidentiary weight should it be needed for a legal case, a regulatory inquiry, etc. This means records have to be stored in such a way that they cannot be edited, altered, or deleted for their entire lifecycle. We address records management in much more detail elsewhere in this course.
Records management solutions can be used to manage physical records, digital records, or both – though most solutions are better at one or the other. In either case records are to be managed according to their content and their value to the organization, not by format. In other words, whether a record is a piece of paper, a Word document, a PDF, an engineering drawing, or any other type of format, it gets managed the same way, retained for the same period of time, and treated the same way at the end of the information lifecycle.

Digital Asset Management
Rich media is an increasingly important type of enterprise content, no longer restricted to only marketing departments. Audio (such as podcasts), video, and digital photographs are common types of rich media. Design documents, marketing assets, logos, architectural and engineering documents are all possible document types held in rich media formats. Advanced users may use a dedicated digital asset management (DAM) system. Many information management solution providers have additional modules or extended packages for the digital asset management power-user. Rich media also often has extended metadata to indicate camera types, geographical data, or resolution. Metadata can also reflect any license or copyright restrictions. For example, a digital photo might be licensed from the owner for a one-year campaign, and the rights to use it online expire after that period. An information management or DAM system can help monitor these deadlines and secure content appropriately.

Email Management
Email management solutions in the information management context are probably more properly called email archiving solutions. Email archiving is an approach and a solution that attempts to capture, preserve, and often index and manage some or all emails sent or received by and within an organization.
This is done for a number of reasons including performance and migration; our focus here is on the information management and governance aspects. The key benefit here is that it happens automatically according to established business rules. That is, users don’t have to select messages to save, and cannot select messages to delete or forget to save: the archive captures all of it. This means that this approach to saving emails is generally less burdensome to users and also more consistent with respect to capturing important messages. Archiving can also include, and be used for, other types of communications objects including text messages, instant messages, social media, and others. Once a message is archived, it is stored and maintained in the email archive until it is dispositioned. Users may or may not have access to archived messages depending on the solution and how it is configured.

Engineering Drawing Management
The last point solution we’ll look at is engineering drawing management. As the name suggests, this solution is used to manage engineering drawings, which are quite common in architecture, engineering, and construction-focused processes. Engineering drawings today are almost always computer-aided designs, or CAD files, which consist of one or more actual files. These are generally structured as a series of layers; for example, for a building, there might be a layer for ventilation, a layer for plumbing, a layer for electrical, and so forth. Drawings might also include 3-dimensional models as well as external reference files that are reused across various drawings. All of this leads to significant complexity in terms of managing those relationships. In addition, engineering drawings often go through a drafting and review process such that there are versions for the as-designed, intermediary, and as-built entities.
As you might imagine, version control becomes a critical issue in the event something happens and we need to know whether a particular valve was built a particular way – or indeed at all! This much more rigorous version control is more generally known as document control; the specifics of this discipline are outside the scope of this course.

Enterprise File Sync and Share
Enterprise file sync and share (EFSS) solutions are cloud-based solutions that allow users to share and synchronize documents easily across multiple devices. They are intended to be lightweight solutions that are easy for all employees to use because they are simple both in their interface and in the capabilities they offer – generally some combination of document storage and simple collaboration. Frequently, these tools also make sharing across organizational boundaries easier in terms of providing access to users outside the organization. The challenge with many of these tools today is that they come from the consumer space, meaning they lack many of the features enterprises require and take for granted: centralized access control, robust security, metadata, lifecycle management, etc. They can also be seen as a way to evade existing IT and information governance requirements. Many of the providers in this space are taking both of these concerns to heart and starting to add enterprise-friendly capabilities; some have gone as far as to create connectors with information management systems or other repositories such that the EFSS solution is the content creation/sharing/collaboration front-end, and the repository is used to store the final version of the document/record.

Selecting the Right Solution
So what’s the right solution? There is no single answer – it depends on your business requirements.
More specifically, comprehensive information management solutions tend to work better for organizations that have more complex requirements and that have the staff to support them: system administrators, technical support, maybe database administrators, etc. On the other hand, for an organization with limited IT capabilities, an enterprise file sync and share solution might be sufficient to meet its needs. If there is a particular requirement not satisfied by either, a point solution can fill that gap. The other consideration is whether the organization wants best of breed capabilities – which suggests point solutions – or the consistency and comprehensiveness of a single solution. It is key to look for the connectivity and scalability of these solutions. Regardless of which approach you take, any repository is better than none at all, such as using file shares or individual PCs to store information. Storing documents in any repository, applying appropriate access controls, and integrating the repository with business applications, BPM solutions, etc. is simply the best way to ensure documents can be retained appropriately and retain their potential evidentiary value.

Virtual Teams and Information Management

Virtual Teams – Definition
Today, the hybrid work arrangement has emerged as the new standard. Virtual workers and virtual teams can be every bit as productive as their in-office counterparts, if not more, if they are set up properly to succeed. Virtual teams may be in the same geographic area and time zone or may span the globe. That raises a number of issues, some of which fall under the umbrella of information management. Having workers, or customers for that matter, in different time zones means that scheduling calls and collaboration meetings is even more difficult than usual – who wants to be on a call at 11 pm or 2 am?
What if there is a bank or federal or religious holiday the team isn’t aware of?

Collaboration Tools
Virtual teams are successful when they have the necessary tools available to do the work that is required. This means access to collaboration tools; different tools have different use cases and approach collaboration differently. Some tools are synchronous, meaning that everyone has to be available to use them at the same time. These would include tools like instant messaging, text messaging, web conferencing, and the like. These are great for co-authoring, reviewing, holding meetings, etc., but they do require everyone to be online at once – consider the meeting that includes team members in Australia, Germany, and the U.S. – there’s just no time that works well for everyone. Other tools are asynchronous, meaning they are designed to be used by different people on different schedules. Email is probably the most well-known of these; despite what many staff (and managers!) might think, emails do not generally need to be answered within 10 seconds of receiving them. And as we’ve seen, email has its own challenges for collaboration. Other tools here would include document management, enterprise file sync and share, and any collaborative tool that includes some sort of workspace and storage. And some tools offer both synchronous and asynchronous capabilities that are used depending on what’s needed.

Collaboration Across Organizational Boundaries
This module will help you to identify the issues associated with sharing content across internal and external organizational boundaries, i.e., between departments or with customers.
Collaboration Issues
In addition to the usual challenges posed in collaborating effectively, there are a number of issues that are either unique to, or exacerbated by, collaborating across departmental and organizational boundaries, including:
• Culture
• Security
• Accessibility
• Time and geography
• Language barriers
Now let’s look at each of these issues in more detail.

Cultural Issues
Cultural issues are always present in a collaborative environment, because of the clash between sharing and hoarding. This can come down to the individual contributor level – if you’re a knowledge hoarder, it may be difficult to get you to share and participate in the process. Cultural issues also exist when an information worker does not want to change the status quo or has concerns about losing control.

One level up, this often manifests in an unwillingness of specific departments to share and participate. It is not uncommon for departments to jealously guard “their” information and keep it from prying eyes elsewhere in the organization. Cross-functional collaboration will help an organization to be more effective than a siloed approach.

And at the interorganizational level, culture clashes can impact collaboration too. For example, in many jurisdictions digital signatures are every bit as legal as wet-ink ones, and there are significant arguments to be made that they are even more robust and defensible than wet-ink signatures. But that doesn’t matter if your company or agency or country doesn’t recognize digital signatures as valid. This could be a legal issue as well, of course. But this is just one example of how these issues could prevent effective collaboration between organizations.
Because legal requirements continue to change, it is important to conduct regular reviews and check the latest legislation covering digital documents, including digital signatures.

Access Controls
The next challenge has to do with access control levels. This is a common issue inside organizations and to some extent may relate to the information hoarding we just discussed. This can be addressed through cross-functional teams and steering groups that can determine appropriate levels of security.

When outside users or organizations are involved, this becomes significantly more challenging. In most cases, your IT or security team will not want to simply grant network access to outsiders. The preferred approach will generally involve using some sort of collaboration tool that is designed for cross-organizational collaboration. Some advanced systems will allow the organization to limit access to very specific sections of the system or even to specific documents.

One quick note about email. Email makes it very easy to collaborate with external users. We discuss elsewhere in this course the significant challenges email presents to effective collaboration, but here’s one more consideration: once you email something to an external party, you no longer have control over what they do with it or the attachments contained in the email. They can save it indefinitely, forward it to anyone they want, edit it however they want, etc. This may be OK depending on the nature of the document and the expected outcome of the collaborative process, but information professionals should be aware of this. Some email systems can restrict the permitted file types, editing, and the ability to forward.

Accessibility and Findability
There certainly could be accessibility and findability issues between different organizations and even different departments, especially for larger, more complex organizations.
While most organizations have standardized on market leaders like Microsoft Office and Adobe Acrobat, not all have, and different versions can present compatibility issues as well.

Metadata structures are often different between different systems in the same organization, let alone between multiple organizations. Similarly, different departments will approach classification structures and folders differently, as will different organizations. This is why many more mature industries have developed standards and standard taxonomies – though this approach has issues as well. We discuss these issues in more detail elsewhere in this course.

Time and Geographic Issues
Any time geographically dispersed groups need to collaborate, synchronous collaboration will be an issue. Consider the challenges of setting up a web conference between offices in Los Angeles, CA; London, UK; and Sydney, Australia. Someone is getting up in the middle of the night for that call.

But there are other geographic issues to consider as well.
• What’s the work culture like in each location, in terms of scheduling, the length of the workday, vacation days, even holidays celebrated here but not there?
• In the age of the General Data Protection Regulation (GDPR) and other privacy and data protection-related regulations, it is key to be aware of the different regulations for safeguarding information and for not retaining participants’ personal data.
• There could be specific legal requirements to consider around intellectual property, appropriate contracting vehicles and terms, etc. We mentioned digital signatures earlier as another specific example of this.

Language Barriers
It may seem obvious that there could be language barriers to consider, but it’s more than just actual languages (and their potential impact on legal terms and conditions, etc.).
It’s also things like department- or organization-specific acronyms, abbreviations, slang terms, or simply different terms and/or spellings – for example, center (American English) vs. centre (most of the rest of the English-speaking world).

And it’s not unusual for different organizations and even different departments within an organization to see a particular concept differently. Engineering and sales and HR will look at a particular resource or process quite differently.

Collaboration and Information Management

Legacy Collaboration Issues

The Problem(s) with Email
Like it or not, email continues to be the primary communication tool within organizations and with external stakeholders. We use it to communicate with our bosses, colleagues, partners, and customers. We use it for storing important messages, and a lot of important collaboration happens in email. That said, email presents significant problems to the organization. Below are six reasons for replacing email with new and better collaboration tools:

1. Email turns collaboration into information chaos. Email is a really good communication and notification tool, but we quickly end up with chaos when multiple people try to use email for discussing a topic or for developing something. We get reply-to-one versus reply-all, and replies to the first message versus replies to later messages. And email volume only makes this worse. Consider: according to research, the average employee sends or receives more than 100 email messages per day. This means that an all-important message can quickly get lost in the deluge.

2. Email locks down information and knowledge. Gallup has found that higher levels of employee engagement correlate with improvements in customer service (10%), productivity (21%), and profitability (22%). Email hinders this because it creates knowledge silos – one silo per mailbox!
That means important information and knowledge gets locked down or lost in personal and corporate mailboxes.

3. Email distracts knowledge workers. New incoming emails tend to distract us. We end up in a responsive mode instead of spending our time being strategic and creative. In the old days we had a “You’ve got mail” audio alert; now it’s best to turn off all email notifications to be productive.

4. Email lacks information filters. “I don’t have an information overload problem, I have filter failure,” says Clay Shirky, futurist and author. Most of us have only spam filters for our emails, so we don’t know if the rest of the emails are important or not until we look at them. Most email providers have tried to fix this by adding a filter for important emails, but this feels quite basic.

5. Email makes it difficult to share large files. One of the reasons so many enterprise file sharing solutions like Box and Dropbox have become so popular is that organizations still have a need to share large files both inside and outside the firewall, yet in many organizations, attachment sizes are arbitrarily limited to, say, 5 megabytes. This means that the tool is getting in the way of doing the business of the organization, and whenever that happens, users WILL figure out a way to work around it unless email is linked to the repositories, collaboration tools, etc.

6. Email leads to attachment spam. Finally, even if you can share files as attachments, this creates its own problem as people send different versions back and forth, some with changes in the document, some with changes listed in the email, etc. Version control is a huge issue in using email to collaborate.

The Problem with Paper
Similarly, paper files present significant issues to collaborative processes. First, paper is time-consuming to access.
Even if it is perfectly filed in a logical filing system, someone has to get up, go to the paper files, and retrieve the file in question. And if the unthinkable happens and a paper file is misfiled, it might as well have been shredded because it will be extremely difficult to find again later.

It’s also difficult to share. If I have the file, you can’t really look at it at the same time. And if you’re in another office or a remote worker, it becomes even more difficult (and time-consuming!).

Paper can be difficult to track – most schemes rely on users or document control staff or records managers to sign documents out manually.

Paper documents can take up a lot of space. Whether onsite at your facilities, or at an offsite storage provider, there is a real cost associated with maintaining paper documents.

Likewise, paper cannot provide as much security as electronic documents. You can’t encrypt paper documents or apply digital rights management to them; anyone with physical access to the document storage location can do whatever they want with or to them.

And paper offers very poor disaster recovery. Every incident and natural disaster results at some point in pictures of a sea of paper swirling or floating or otherwise being lost or destroyed. Because of the bulk and cost associated with managing paper documents operationally, including the cost and time required to make copies, most organizations identify their most critical documents as “vital records,” which are then stored in very safe, but very expensive storage: fireproof vaults, other offices, or offsite storage.

The Problem with Digital Landfills
This brings us to digital documents, which arguably address many of the issues we identified in the previous two sections.
However, digital documents present their own issues, most notably:
• The volume is significantly higher compared to paper documents, and
• In the absence of a formal information management system, these documents are stored on network shared drives or perhaps an ungoverned SharePoint implementation, with the result that those systems come to resemble “digital landfills” overflowing with versions, outdated content, personal content, and all manner of other stuff.

This means that even though we have digital documents, we are back to information being difficult to find, and to being unable to identify the right version or confirm what happened to a particular document. If the digital documents have little to no metadata, finding them is as difficult as finding a paper document in a filing cabinet. The net result is that it is increasingly difficult to be confident that a particular document or set of documents is accurate, trustworthy, and reliable.

The other issue with these solutions is that they tend to focus on internal collaborative processes rather than external. If you have a business need to collaborate with suppliers, customers, partners, or outside parties, it can be difficult or even impossible to do so from a practical information management and information security perspective.

What’s the answer? As we’ll see shortly, the answer is using an effective collaboration tool that enables more efficient collaborative processes while still ensuring appropriate levels of governance and security.

Document-centric Collaboration
This module will help you to identify the key features required for effective document-centric collaboration, such as version control, workflow, and access controls.

Why Collaborate?
Many of the core activities related to the lifecycle of business information are collaborative by nature.
Creating, revising, and reviewing content are tasks that often involve two or more people. Content that has been drafted, saved into an information management system, and organized with metadata often will still require revision cycles, proofreading, and approvals before it is considered complete. An information management system provides an efficient, solid foundation for such content sharing, co-creation, or review activities.

Document-centric Collaboration
Document-centric collaboration is a combination of technologies, usually for asynchronous collaboration, though some provide synchronous capabilities as well. This includes workspaces for collaboration on documents; mark-up and annotations; and version control. These tools can be quite robust and manage a complex draft and review process, or be very simple and get the tool out of the way in order to improve the collaborative process.

Important Features
Here is a more detailed listing of capabilities to look for in a document-centric collaboration solution. These features would generally be found in document management solutions or in the document management capabilities of an information management solution.

How to collaborate on documents. Here we’re looking for the ability to create a new document or open an existing one and begin the collaborative process. Necessary features here would include:
• Commenting – the ability for multiple users to add comments to a document in draft and to have those comments available to others for review.
• Version control – a mechanism to ensure that when changes are made or a new version is uploaded, there is a record of the change history including who made the changes.
• Workflow – business rules that help to ensure that documents are complete, all necessary reviews have been completed, and the document has been formally approved.
• Access controls – ensure that only authorized users can make changes.
• Audit trail – tracks all changes made by a particular user or to a particular document.
• Notifications and alerts – for things like changes in status, requests for review, etc.

Finally, a few solutions offer co-authoring capabilities. This allows users to work in the same document simultaneously. Users can make edits and see others’ edits as they are being made; some solutions also build in the ability to chat or send notifications in real time. Examples here include Google Docs and Microsoft SharePoint.

Use Case: Improving Meetings
Throughout this module we’ve reviewed the ways in which collaboration and information management can support better work practices and enable improved sharing and communication. Let’s take a moment to apply some of these concepts, and step through how collaborative approaches can help streamline one of the most common business activities – the routine team meeting. We’ll step through an example of how to introduce quicker, simpler collaboration methods and remove some of the email burden from routine processes.

Think about typical processes today: call participants dial in and designate someone to take notes on a local laptop. Notes are emailed to the team hours or days after the call. Email threads make follow-up tracking difficult, especially when the group grows over time and new roles are introduced. It can be hard to include new team members when the background of a project is stored in dozens of email threads. The convoluted thread of replies, forwards, and cc’s can be overwhelming for new team members. Information to support meetings is usually copied as email attachments, which raises the question: who has the right version weeks later?

Now let’s consider how the same weekly team call could be more effective by incorporating more collaborative information management tools.
The team could establish a simple online wiki page that can be edited in real time during the call to note all decisions, action items, and updates. The follow-up activities can be recorded in one place to be updated over the course of the next work week.

Plans, schedules, or other documents can be stored in an information management system with version control and access controls. This helps to reduce confusion, track progress, and show which users have contributed to ongoing updates. Links to these managed documents ensure up-to-date versions are always available from one central online location.

As the project progresses over weeks or months, new project members can quickly read the background and history in one place to get up to speed. The potential for productivity gains is high: saving time, reducing confusion, greatly reducing the volume of email and copied documents, and minimizing time to productivity for new participants.

Collaboration and Governance
This module will help you to determine whether and how to apply governance to collaboration environments and artifacts.

Collaboration and Governance
So far, we’ve been focused on the creative side of collaboration and how to make it useful. But we can’t leave this topic without some discussion of governance. Effective governance supports effective collaboration by providing some guidance and boundaries. For example, in a complex authoring environment, governance would include the review and approval steps, as well as ensuring that all the parts that make up the final deliverable are present, are the correct version, etc.

Governance also helps to ensure that the final output of the collaborative process is trustworthy and reliable. We know who contributed, who approved it, and that it hasn’t been changed since that approval. So, governance does need to be applied to and throughout the collaborative process.
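One common way to show that a deliverable “hasn’t been changed since that approval” is to store a cryptographic fingerprint of the document alongside the approval record. The Python sketch below is illustrative only and is not taken from the study guide: the function names, the record layout, and the sample approver are hypothetical, and a real information management system would typically perform an equivalent check internally.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the document content."""
    return hashlib.sha256(content).hexdigest()


def approve(content: bytes, approver: str) -> dict:
    """Record who approved the document and its fingerprint at approval time."""
    return {"approver": approver, "sha256": fingerprint(content)}


def unchanged_since_approval(content: bytes, record: dict) -> bool:
    """Check whether the current content still matches the approved fingerprint."""
    return fingerprint(content) == record["sha256"]


# Hypothetical deliverable and approver, for illustration only.
approved_text = b"Final project plan, version 3"
record = approve(approved_text, "j.doe")

print(unchanged_since_approval(approved_text, record))
print(unchanged_since_approval(b"Final project plan, version 3 (edited)", record))
```

Any edit to the content, however small, produces a different digest, so the stored record supports the traceability that governance is meant to provide without restricting how the team collaborates before approval.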
The question then becomes how much, and how to balance the need for governance with the need for effective collaboration. If the process is locked down too tightly, the collaborative process will suffer.

Governance and the Platform
The first thing to consider is how the collaborative platform or environment supports governance. The most obvious element is access controls – that is, who can access the environment, who can access a particular project or document or site, and what can they do when they get there?

Next, we can look for document management functionality: check-in/check-out and version control, so that changes can be tracked and rolled back if necessary.

We can implement business rules and workflows to guide the flow of the collaborative process, all the way through review, approval, and final publication.

And there’s an element that is often overlooked: What do you do with the collaborative environment once the work is complete? All the drafts, supporting documentation, etc. have to be dealt with in accordance with the broader information governance framework. This needs to include all the other artifacts of the process – email message threads, chats, recorded web conferencing sessions, and the like.

Roles and Responsibilities
We also want to consider the governance framework and how it can support the process. There are a couple of main areas to consider.
Roles and responsibilities go hand in hand with the access controls we discussed earlier, and include roles like:
• Project manager, authoring lead, or whoever is in charge of a particular collaborative process
• IT – providing, administering, and supporting the system
• Legal – to address any legal considerations, as well as any potential requests for information about the process
• Records management – to ensure that any artifacts of the process are managed appropriately according to the retention schedule
• Privacy / data protection, where the output of the collaboration includes that type of data

Policies and Procedures
Policies and associated processes and procedures should be developed at the platform level, to address collaboration as a whole, and for individual collaborative processes. Here are some things to address:
• How are collaborative environments provisioned and made available? What does the request process look like? Are any approvals required? Or can users simply create a space, invite some collaborators, and start using it?
• What are the rules around external collaborators – are there certain processes or documents that external collaborators should not have access to for some reason?
• Are there any formal review and approval requirements? Even if not, a quality control and tacit approval process should be included.
• What happens at the end of the collaborative process? How does the team know when it’s done, and what happens to the space and the artifacts within it?

Metadata and Findability
And here are some considerations for the metadata associated with a collaborative process. Some advanced AI tools offer the ability to search for specific documents, e.g., the latest contract with supplier X, without relying on any metadata.
• Naming conventions. This goes to how collaborative environments are named so they can be found by interested or appropriate collaborators.
This might also apply to final deliverables. Naming conventions should be concise and describe the environment or deliverable accurately.
• Metadata more broadly. There is a lot of metadata generated during the collaborative process, ranging from the naming conventions above to date created, date last accessed, or date deleted, to many others. Good metadata improves the findability of the environment, any deliverables, and any supporting information. It also provides traceability to understand who participated, how they contributed, who approved the ultimate deliverable, etc. This means that there needs to be governance around who can change metadata in the collaborative environment.

Digital Preservation

Digital Preservation Risk Factors
Almost all physical records require little or no technology to allow humans to extract the information from them. You can pick up an old record, and whether it is written or printed on clay, papyrus, paper, or anything else, you can make out what it says. There are exceptions of course – microfilm, audio, and video recordings all need some sort of technological assistance, but these are all relatively recent developments, and are in the minority.

By contrast, digital records need technology to allow humans to understand them. This includes a lot of different elements:
• Servers
• Disk drives
• Network
• PC
• Screen
• Operating system
• ERM system software
• And so on

The other issue is that software and hardware evolve rapidly, and compatibility issues can start to occur in as little as a single upgrade.

Digital Preservation Issues
You may by now have realized that digital preservation can be broken down into three key issues:
• Storage media obsolescence
• Media degradation
• Format obsolescence
Now let’s look at each in a little more depth.
Storage Media Obsolescence
The first problem we’ll consider is storage media obsolescence. This refers to the simple truth that storage media, and the devices to read them, tend to fall out of fashion quite quickly. Here is a selection of obsolete computer storage media:
• A short length of punched paper tape
• A single 80-column punched card
• A selection of 5¼ inch floppy disks
• Two different formats of tape cartridges
• An 8-inch floppy disk, the first widely used floppy disk format
• A 3½ inch, 128-megabyte rewritable magneto-optical disk

So, for each of these, and many others not shown, the key question is where would you find the hardware to read that media today? And it’s not just the 5¼ inch floppy drive itself, for example, but also a suitable power supply, the software dr