Tuesday, December 29, 2015
Many people are highly dependent on their mobile devices for everyday interactions, including mobile commerce. Our society is becoming highly mobile and connected. The latest Shop.org and Forrester Research Mobile Commerce Survey estimates that U.S. smartphone commerce will grow to $31 billion by 2016.
Those organizations that can best serve mobile customers will have a competitive advantage. With a surge in mobile traffic comes the added potential to connect with and sell to customers through mobile commerce. Having a concrete mobile infrastructure plan and strategy is no longer optional, as it was in recent years, but a must to compete in any customer-facing situation.
But despite this upward trajectory, retailers and other consumer-oriented companies remain hesitant about investing in multi-device environments and moving forward with mobile planning. Companies still struggle to maintain uniformity across multiple device experiences when there are various screen sizes, operating systems, hardware specifications, and loading speeds to consider. One fear is that of the unknown, but security, data management, and simply proving a use case and subsequent return on investment are concerns as well.
The key issue in smartphone shopping continues to be the form factor, which can make navigation more difficult for customers. In addition to slower page load times on smartphones, some customers are concerned about the security of the transaction or simply complain that the experience just is not the same.
A successful mobile experience, like many other customer experiences, is about fulfilling customers' needs. According to the survey of mobile users, first-time users of a mobile site or app tend to be less satisfied with their mobile experiences than frequent users because they are less familiar with layouts, navigation, and functionality. Knowing the different kinds of mobile devices customers use is critical. It is important to develop a strategy that encompasses all types of customer scenarios.
Before embarking on any one mobile strategy, it is important to learn how your company's customers most likely would use their mobile devices. In addition to enabling customers to interact how they wish, any company looking to optimize its mobile presence must naturally consider the effects on the business as well, and how mobile usage will impact other lines of business and cross-channel marketing efforts.
In addition to justifying a use case and ROI for mobile, companies that wish to get into the mobile side of business must be aware of its limitations. Under ideal circumstances, companies want to engage with their customers and cultivate a one-to-one relationship while taking into consideration CAN-SPAM and privacy regulations. It is very important to adjust taxonomy and information architecture for the mobile experience. Many searches are made on mobile devices, so search also has to be optimized.
Optimizing your mobile site or developing a native application is no simple task. There are security considerations, as well as device-specific functions, to consider. Don't take a cookie-cutter approach. Some companies make the mistake of simply cloning online information without considering that consumer behavior on the mobile phone is dramatically different. Justify mobile ROI with consumer insight.
Consider security. Create a military-grade security infrastructure, while maintaining user-friendly design. Hire the best user interaction designer to design the security setup interaction.
Utilize mobile wisely. Once someone has discovered your brand through search, referral, or a marketing message, and they download the app, this may indicate a loyal customer. The app can be a great way to maximize and monetize that loyal relationship because it's in a controlled environment.
Galaxy Consulting has experience optimizing information architecture and search for mobile devices. Contact us today for a free consultation.
Monday, December 7, 2015
A data lake is a large storage repository and processing engine. Data lakes focus on storing disparate data and ignore how or why data is used, governed, defined and secured.
The data lake concept hopes to solve information silos. Rather than having dozens of independently managed collections of data, you can combine these sources in the unmanaged data lake. The consolidation theoretically results in increased information use and sharing, while cutting costs through server and license reduction.
Data lakes can help resolve the nagging problem of accessibility and data integration. Using big data infrastructures, enterprises are starting to pull together increasing data volumes for analytics or simply to store for undetermined future use. Enterprises that must use enormous volumes and myriad varieties of data to respond to regulatory and competitive pressures are adopting data lakes. Data lakes are an emerging and powerful approach to the challenges of data integration as enterprises increase their exposure to mobile and cloud-based applications, the sensor-driven Internet of Things, and other aspects.
Currently the only viable example of a data lake is Apache Hadoop. Many companies also use cloud storage services such as Amazon S3, along with other open source tools such as Docker, as a data lake. Academic interest in the concept of data lakes is growing gradually.
Previous approaches to broad-based data integration have forced all users into a common predetermined schema, or data model. Unlike this monolithic view of a single enterprise-wide data model, the data lake relaxes standardization and defers modeling, resulting in a nearly unlimited potential for operational insight and data discovery. As data volumes, data variety, and metadata richness grow, so does the benefit.
Data lakes are helping companies to collaboratively create models or views of the data and then manage incremental improvements to the metadata. Data scientists and business analysts use the newest lineage tracking tools, such as Revelytix Loom or Apache Falcon, to follow each other’s purpose-built data schemas. The lineage tracking metadata is also placed in the Hadoop Distributed File System (HDFS), which stores pieces of files across a distributed cluster of servers in the cloud, where the metadata is accessible and can be collaboratively refined. Analytics drawn from the data lake become increasingly valuable as the metadata describing different views of the data accumulates.
Every industry has a potential data lake use case. A data lake can be a way to gain more visibility or to put an end to data silos. Many companies see data lakes as an opportunity to capture a 360-degree view of their customers or to analyze social media trends.
Some companies have built big data sandboxes for analysis by data scientists. Such sandboxes are somewhat similar to data lakes, albeit narrower in scope and purpose.
Relational data warehouses and their big price tags have long dominated complex analytics, reporting, and operations. However, their slow-changing data models and rigid field-to-field integration mappings are too brittle to support big data volume and variety. The vast majority of these systems also leave business users dependent on IT for even the smallest enhancements, due mostly to inelastic design, unmanageable system complexity, and low system tolerance for human error. The data lake approach helps to solve these problems.
Step number one in a data lake project is to pull all data together into one repository while giving minimal attention to creating schemas that define integration points between disparate data sets. This approach facilitates access, but the work required to turn that data into actionable insights is a substantial challenge. While integrating the data takes place at the Hadoop layer, contextualizing the metadata takes place at schema creation time.
Integrating data involves fewer steps because data lakes don’t enforce a rigid metadata schema as do relational data warehouses. Instead, data lakes support a concept known as late binding, or schema on read, in which users build custom schema into their queries. Data is bound to a dynamic schema created upon query execution. The late-binding principle shifts the data modeling from centralized data warehousing teams and database administrators, who are often remote from data sources, to localized teams of business analysts and data scientists, who can help create flexible, domain-specific context. For those accustomed to SQL, this shift opens a whole new world.
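To make late binding concrete, here is a minimal sketch in plain Python. It is not tied to Hadoop or any particular query engine; the record fields, the two schemas, and the `read_with_schema` helper are all hypothetical names chosen for illustration. The point is only that the same raw records can be read through different schemas at query time.

```python
import json

# Raw records land in the lake as-is; nothing validates them at write time.
# (Hypothetical sample data for illustration only.)
RAW_EVENTS = [
    '{"user": "u1", "amount": "19.99", "channel": "mobile"}',
    '{"user": "u2", "amount": "5", "device": "tablet"}',
    '{"user": "u3", "channel": "web"}',
]

# Each analyst supplies a schema for the question at hand:
# field name -> (type to cast to, default when the field is absent).
SALES_SCHEMA = {"user": (str, None), "amount": (float, 0.0)}
CHANNEL_SCHEMA = {"user": (str, None), "channel": (str, "unknown")}


def read_with_schema(raw_lines, schema):
    """Bind each raw record to the given schema at read time."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field_name, (cast, default) in schema.items():
            value = record.get(field_name)
            row[field_name] = cast(value) if value is not None else default
        rows.append(row)
    return rows


# Two different views over the same untouched raw data.
sales = read_with_schema(RAW_EVENTS, SALES_SCHEMA)
channels = read_with_schema(RAW_EVENTS, CHANNEL_SCHEMA)
total_sales = sum(row["amount"] for row in sales)
```

Neither view required changing the stored records, which is the practical difference from a warehouse that fixes one schema when data is loaded.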
Some data lake initiatives have not succeeded, producing instead more silos or empty sandboxes. Given the risk, everyone is proceeding cautiously. There are companies who create big data graveyards, dumping everything into them and hoping to do something with it down the road.
Companies can avoid creating big data graveyards by developing and executing a solid strategic plan that applies the right technology and methods to the problem. Hadoop and the NoSQL (Not only SQL) category of databases have potential, especially when they can enable a single enterprise-wide repository and provide access to data previously trapped in silos. The main challenge is not creating a data lake, but taking advantage of the opportunities it presents. A means of creating, enriching, and managing semantic metadata incrementally is essential.
Data Flow in the Data Lake
The data lake loads extracts, irrespective of their format, into a big data store. Metadata is decoupled from its underlying data and stored independently. This enables flexibility for multiple end-user perspectives and maturing semantics.
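The decoupling of metadata from data can be sketched as two separate stores: one holding raw payloads untouched, and a catalog holding descriptions that different groups add over time. This is a toy in-memory illustration, not how any specific data lake product works; `TinyLake`, `MetadataEntry`, and the sample keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataEntry:
    """One group's description of a stored object."""
    description: str = ""
    tags: set = field(default_factory=set)


class TinyLake:
    """Toy lake: raw payloads and metadata live in separate stores."""

    def __init__(self):
        self._objects = {}  # key -> raw bytes, stored in whatever format arrived
        self._catalog = {}  # key -> {perspective name -> MetadataEntry}

    def load(self, key, payload):
        # Loading never requires metadata; it can be added later.
        self._objects[key] = payload

    def annotate(self, key, perspective, description, tags=()):
        if key not in self._objects:
            raise KeyError(f"no object stored under {key!r}")
        views = self._catalog.setdefault(key, {})
        entry = views.setdefault(perspective, MetadataEntry())
        entry.description = description
        entry.tags.update(tags)

    def find(self, tag):
        """Return keys that any perspective has labeled with the tag."""
        return sorted(
            key
            for key, views in self._catalog.items()
            if any(tag in entry.tags for entry in views.values())
        )

    def raw(self, key):
        return self._objects[key]


lake = TinyLake()
lake.load("crm/export.csv", b"id,name\n1,Ann\n")
lake.load("web/clicks.json", b'{"u": 1}')
# Two departments describe the same file differently; the file is unchanged.
lake.annotate("crm/export.csv", "marketing", "Customer master list", ["customer"])
lake.annotate("crm/export.csv", "finance", "Billing contacts", ["customer", "billing"])
lake.annotate("web/clicks.json", "marketing", "Clickstream", ["behavior"])
```

Because annotations accumulate per perspective, the catalog can grow richer without the raw data ever being rewritten.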
How a Data Lake Matures
Sourcing new data into the lake can occur gradually and will not impact existing models. The lake starts with raw data, and it matures as more data flows in, as users and machines build up metadata, and as user adoption broadens. Ambiguous and competing terms eventually converge into a shared understanding (that is, semantics) within and across business domains. Data maturity results as a natural outgrowth of the ongoing user interaction and feedback at the metadata management layer, interaction that continually refines the lake and enhances discovery.
With the data lake, users can take what is relevant and leave the rest. Individual business domains can mature independently and gradually. Perfect data classification is not required. Users throughout the enterprise can see across all disciplines, not limited by organizational silos or rigid schema.
Data Lake Maturity
The data lake foundation includes a big data repository, metadata management, and an application framework to capture and contextualize end-user feedback. The increasing value of analytics is then directly correlated with the increase in user adoption across the enterprise.
Data lakes also carry risks. The most important is the inability to determine data quality or the lineage of findings by other analysts or users who have previously found value in the same data in the lake. By its definition, a data lake accepts any data, without oversight or governance. Without descriptive metadata and a mechanism to maintain it, the data lake risks turning into a data swamp. And without metadata, every subsequent use of data means analysts start from scratch.
Another risk is security and access control. Data can be placed into the data lake with no oversight of the contents. Many data lakes are being used for data whose privacy and regulatory requirements are likely to represent risk exposure. The security capabilities of central data lake technologies are still in the beginning stage.
Finally, performance aspects should not be overlooked. Tools and data interfaces simply cannot perform at the same level against a general-purpose store as they can against optimized and purpose-built infrastructure.
Careful planning and organization of data lake strategy is required to make this project a success.
Monday, November 23, 2015
Microsoft releases a new version of SharePoint every three years. A public beta of SharePoint 2016 is available, and the full version is expected in spring 2016. Here is what is new in SharePoint 2016.
SharePoint 2016’s main goal is to bring the best of Office 365 Cloud technology to on-premises solutions. In this truly effective Hybrid model, organizations will be able to have the best of the Cloud, whilst keeping all their important information and data stored on-premises.
SharePoint Server 2016 has been designed to reduce the emphasis on IT and streamline administrative tasks, so that IT professionals can concentrate on core competencies and mitigate costs. Tasks that may have taken hours to complete in the past have become simple and efficient processes that allow IT to focus less on day-to-day management and more on innovation.
- Mobile experiences
- Personalized insights
- People-centric file storage and collaboration
- Improved performance and reliability
- Hybrid cloud with global reach
- Support and monitoring tools
- New data protection and monitoring tools
- Improved reporting and analytics
- Trusted platform
You can now install just the role that you want on particular SharePoint 2016 servers. This will only install what’s required there, and it will make sure that all servers that belong to each role are compliant. You will also be able to convert servers to run new roles if needed. You can look at the services running on the SharePoint 2016 server and see if they are compliant.
Downtime for Updates
Downtime previously required to update SharePoint servers has been removed.
Mobile and touch
Making decisions faster and keeping in contact are critical capabilities for increasing effectiveness in any organization. The ability for end users to access information while on the go is now a workplace necessity. In addition to a consistent cross-screen experience, SharePoint Server 2016 provides the latest technologies and standards for mobile push and information synchronization. With deep investment in HTML5, SharePoint 2016 provides capabilities that enable device-specific targeting of content. This helps to ensure that users have access to the information they need, regardless of the screen they choose to access it on.
SharePoint 2016 further empowers users by delivering a consistent experience across screens, whether using a browser on the desktop or a mobile device. Through this rich experience, users can easily transition from one client to another without having to sacrifice features.
The App Launcher provides a new navigation experience where all your apps are easily available from the top navigation bar. You can quickly launch your application, browse sites and access your personal files.
Based on SharePoint Online and OneDrive for Business, SharePoint 2016 document libraries inherit the improved control surface for working with content, simplifying the user experience for content creation, sharing and management.
SharePoint 2016 improves the sharing experience by making it more natural for users to share sites and files. You can just click the "Share" button at the top right corner of every page, enter the names of people you want to share with, and press Enter. The people you just shared with will get an email invitation with a link to the site.
SharePoint still uses powerful concepts like permission levels, groups and inheritance to provide this experience. Part of sharing is also understanding who can see something. If you want to find out who already has access to a particular site, you can go to the "Settings" menu in top right corner, click "Shared with", and you will see the names and pictures of people who have access to the site.
Large File Support
SharePoint 2016 provides support for uploading files up to 10 GB.
Preventing data loss is non-negotiable, and over-exposure to information can have legal and compliance implications. SharePoint 2016 provides a broad array of features and capabilities designed to make certain that sensitive information remains that way, and to ensure that the right people have access to the right information at the right time.
New In-Place Hold Policy and Document Deletion Centers will allow you to manage time-based, organization-wide in-place hold policies to preserve items in SharePoint and OneDrive for Business for a fixed period of time, in addition to managing policies that can delete documents after a specified period of time.
Cloud Hybrid Search
Cloud hybrid search offers users the ability to seamlessly discover relevant information across on-premises and Office 365 content. With the cloud hybrid search solution, you index all your crawled content, including on-premises content, in your search index in Office 365. When users query your search index in Office 365, they get unified search results from both on-premises and Office 365 cloud services with combined search relevancy ranking.
Cloud hybrid search provides some key benefits to customers of both SharePoint 2013 and early adopters of SharePoint 2016 IT preview, such as:
- the ability to reduce your on-premises search footprint;
- the option to crawl in-market and legacy versions of SharePoint, such as 2007, 2010 and 2013, without requiring upgrade of those versions;
- avoiding the cost of sustaining large indexes, as it is hosted in Office 365.
With this new hybrid configuration, this same experience will also allow users to leverage the power of Office Graph to discover relevant information in Delve, regardless of where information is stored. You will not only be able to get back to all the content you need via Delve, but also discover new information in the new Delve profile experiences and even have the ability to organize content in Boards for easy sharing and access.
You will have to use the Office 365 Search for this to work. If SharePoint 2016 On-Premises users query against their On-Premises Search service, it will continue to give them local results only.
However, once available, this will allow users to fully embrace experiences like Delve in Office 365 and more to come in the future.
With SharePoint 2016, you can redirect your My Sites to your Office 365 subscription’s OneDrive for Business host. In other words, a user who clicks on OneDrive will be redirected to their Office 365 My Site rather than to the on-premises one. Although you can use document libraries in on-premises SharePoint, Microsoft's larger strategy pushes users to use OneDrive to manage files across all devices. This creates the ability to integrate that OneDrive cloud storage into your on-premises SharePoint.
Users can now click “Follow” on sites both on-premises and in Office 365 and see all followed sites in one place under the “Sites” app in the App Launcher.
The OneDrive for Business area aims to bring users to one place to help them work with their files regardless of where they are. You will also be able to navigate your Sites and their libraries from there.
Saturday, November 7, 2015
Vasont is a component content management system. It has powerful capabilities to store, update, search, and retrieve content. It offers version control, integrated workflows, project management, collaborative review, translation management, and reporting to manage content and business processes.
Vasont provides opportunities for multi-channel publishing and editing in your favorite applications. In addition, it provides an advanced editorial environment built to maximize, manage, and measure content reuse. Unicode support enables multi-language implementations. It also integrates the ability to process content with reusable, event driven business logic as an integral part of the system.
Content is stored in an underlying Oracle database and can be imported, exported, and stored in a variety of formats, including XML, SGML, HTML, as well as other formats that are required as input documents or deliverable formats. This is possible because Vasont can store content separately from any specific tagging structure.
Vasont can be used to store and manage embedded multimedia in structured content. It can also be used to provide a consistent organization and hierarchy to unstructured business documents and other digital assets to provide an overall document management solution. Vasont stores both component-level graphics and unstructured business documents as multimedia components.
Content can be stored at a document or sub-document level and with any content assets such as graphics and references. Vasont has great power at the component level with content organized using XML as input and output. Content can be manipulated and reused at any level of granularity. It is easy to add metadata to existing content and take advantage of the richness that metadata can provide.
Vasont also excels at integrating XML and non-XML traditional document content to provide powerful content applications that can cross departmental or functional boundaries. It is effective in a variety of content scenarios or in combined scenarios, including:
- highly structured XML or SGML content;
- structuring unstructured information assets such as in regulatory environments;
- documents, especially linked to workflow and business logic;
- digital assets such as graphics.
Vasont allows the building of content within and among these content relationships and content scenarios. It provides the power to model information in an organization and share it across different divisions. It stores all types of content in one repository: structured content (e.g., XML, HTML, SGML, text, and pointers), multimedia files, and unstructured documents (e.g., Word, Excel, and PDF files, and graphics).
In Vasont Administrator, an administrator can set up the rules of structure and apply any processing options needed to transform, validate, or redirect data. The administrator can store settings for loading, extracting, editing and viewing data; user permissions; and workflow. Administrative responsibility can be assigned to specific Associate Administrators so that multiple groups or departments can share the system and yet control their own setups.
The system includes Vasont Universal Integrator (VUI) for Arbortext Editor, Adobe FrameMaker, JustSystems XMetaL, Quark XML Author, or Microsoft Word. The VUI allows authors to work in a familiar environment and provides a frequently used subset of functionality available in Vasont to simplify the editing process.
Vasont High-Level Application Architecture
The main parts are User Navigator and Content Navigator. Users, their roles, and their permissions are set up using User Navigator. Content Navigator includes content definitions, content instances, workflow definitions, load and extract views, and business logic in the form of processing options.
The Vasont Application Programming Interface (API) supports advanced customization and integration. The Vasont API allows for development of:
- custom user interfaces;
- web access to Vasont;
- processing options.
Vasont Daemon Programs provide background processing routines that automate repetitive tasks such as extracting and loading content. Some customization is required to implement them.
The content model and the corresponding rules of structure are defined by the administrator in the Vasont Administrator. These rules usually correspond closely to the structure rules defined in a Document Type Definition (DTD) or schema, but they may differ somewhat or may support multiple DTDs for different outputs. Structures may also be defined in Vasont, independent of a DTD, which is useful when storing documents and other digital assets that may need to be organized in a specific way but are not structured XML or SGML content. The rules of structure help guide you through the editing process by allowing you to place components in only the appropriate locations in a collection.
The Vasont Administrator is also used to define the big picture of how collections will be organized in Vasont, through the creation of content types and collection groups. These categories are represented in a tree or list view in Vasont and have symbols that represent them. The tree view shows the sequencing and grouping of collections.
The detailed items in a collection are called components. The top component in each tree view is called the primary. Normally a collection will contain many primaries.
Vasont has several classes of components, and components can be broken down into smaller chunks, depending on the needs of the organization. The level of chunking is called granularity. It is essential to understand how your Vasont system has been configured so that you can find and edit the relevant material and maximize reuse. Granularity describes the smallest chunk of content stored in Vasont. A coarse level of granularity means that content is stored in large chunks. For example, you may have Book, Chapter, and Section components with no components defined at a level lower than Section. On the other hand, a very granular setup stores content in very small chunks, typically broken down into paragraph-level components or the equivalent.
Content types are the highest level of organization in Vasont and often serve as major divisions in content. Typically, different content types store content with very different content models, such as content used in different divisions or groups within a corporation. Content types are set up in the Vasont Administrator.
Content in each content type is organized into collections and optional collection groups. Inside a content type called Publication, a collection such as Manuals is a grouping of similar content that follows the same structure. Depending on how similar the content model is, collections and collection groups within a single content type may share content. Collections in the same content type have similar content models so that content can be reused, moved, and referenced. Content in collections from different content types may be reused if the content types share similar raw components. Pointers are allowed from components in one collection to components in another collection, and the collections can be in different content types.
Components are reusable chunks of content defined in the rules of structure for each collection. Although not required to, components usually correspond to elements in a document type definition (DTD). The three types of components are: text, multimedia, and pointer.
Metadata, or information about your content, helps you automate business logic and categorize, locate, filter, and extract content. Traditional types of metadata for topics include index entries that describe content or identifiers that can be used for cross-referencing or mapping context-sensitive help in software applications. Other examples of metadata include labeling content that applies to a particular customer or vendor, whether content should be published to an online help system or a printed manual, or other types of classifications. Metadata can be information that helps perform automated business logic through the use of Vasont Processing Options.
The Vasont Navigator provides an intuitive way to view, edit, reuse, and search content within a collection. Its hierarchical structure represents the organization of content in the system and icons indicate the state of items, including whether they have been included in a log. Components may be opened and closed individually or in groups. Open multiple Navigator windows to drag and drop content easily from one location to another, either within or across collections, rather than scrolling up and down the tree view.
Vasont provides powerful search capabilities to find and reuse content across the entire organization. The search function allows you to search for content across collection boundaries. When performing a cross-collection search, you are prompted to select the collections to search and then specify query criteria for the content desired.
The Vasont Content Ownership feature gives a designated user the right to assign ownership of a Primary component to an individual user or a group of users, which provides the exclusive right to alter the specified content. Once ownership is assigned, the Vasont CMS recognizes users who have permission to add, delete, or change the content, and prevents those who do not have ownership permissions from making changes to it.
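The ownership rule described above amounts to a lookup before any change is allowed. The following sketch is only an illustration of that idea and does not use the Vasont API; `OwnershipRegistry` and its methods are hypothetical, and the choice to leave unowned content open to everyone is an assumption the text does not confirm.

```python
class OwnershipRegistry:
    """Toy model of per-primary ownership checks (not the Vasont API)."""

    def __init__(self):
        self._owners = {}  # primary component id -> set of owning user names

    def assign(self, primary_id, users):
        """Called by the designated user to grant exclusive edit rights."""
        self._owners[primary_id] = set(users)

    def can_modify(self, primary_id, user):
        owners = self._owners.get(primary_id)
        # Assumption: content with no assigned owner is editable by anyone.
        return owners is None or user in owners


registry = OwnershipRegistry()
registry.assign("manual-42", ["alice", "bob"])
```

With this setup, `registry.can_modify("manual-42", "carol")` is false, while any user may still modify a primary that has no owner assigned.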
Each and every piece of unique content is stored in the raw material only once. Vasont compares content in the same raw component or in aliased raw components to determine if the content has been used in more than one instance. If the text of the components is the same, it is stored in the raw material as a single component. Vasont's ability to automatically reuse content where it can, without any specific setup, is called implicit reuse.
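The store-once behavior can be illustrated with a small content-addressed store. How Vasont actually compares components is not described here, so the use of a SHA-256 digest as the comparison key is an assumption, and `RawMaterialStore` is a hypothetical name, not part of Vasont.

```python
import hashlib


class RawMaterialStore:
    """Toy store that keeps each unique text once, however often it is used."""

    def __init__(self):
        self._raw = {}        # digest -> the single stored copy of the text
        self._instances = []  # (context, digest) for every place the text appears

    def add(self, context, text):
        # Assumption: identical text is detected by hashing its exact bytes.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self._raw.setdefault(digest, text)  # stored only on first sight
        self._instances.append((context, digest))
        return digest

    def unique_count(self):
        return len(self._raw)

    def contexts_for(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return [ctx for ctx, d in self._instances if d == digest]


store = RawMaterialStore()
warning = "Disconnect power before servicing."
store.add("Manual A / Safety", warning)
store.add("Manual B / Maintenance", warning)
store.add("Manual B / Intro", "Welcome to the product.")
```

Here the warning appears in two manuals but occupies one entry in the raw material, so a later edit to that entry would reach both contexts.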
Depending on your setup, you may explicitly reuse content by referencing or “pointing to” relevant content from different contexts. For example, you may have a collection of shared procedure components that you can point to rather than storing the entire procedure in multiple locations.
Vasont offers a Translation Package that enables users to lower their overall translation costs by minimizing the amount of content that needs to be translated. This is possible because it keeps track of content that has already been translated and ensures it is not re-translated. It also measures the amount of savings for each translation project by identifying the percentage of words that have already been translated.
It offers Translation Management that helps users manage projects and sub-projects by tracking dates, vendors, languages, and status information. A translation project is a module of content that is being translated into multiple languages (e.g., a topic that is being translated into French, German, and Chinese). The sub-projects are the individual languages into which the module is being translated (e.g., the French translation is a sub-project of the topic translation project). You can submit your projects for quote or send them for translation directly from Vasont's translation window. This window also provides word counts for each translation project.
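The savings figure described above is a simple ratio of already-translated words to total words. The sketch below shows that arithmetic under stated assumptions: words are counted by splitting on whitespace, a component counts as translated only on an exact text match, and `translation_savings` is a hypothetical helper rather than a Vasont function.

```python
def translation_savings(components, already_translated):
    """Return the percentage of words that do not need to be translated again.

    components: list of component texts in the project.
    already_translated: set of component texts with an existing translation.
    """
    total_words = 0
    reused_words = 0
    for text in components:
        words = len(text.split())  # assumption: whitespace-delimited word count
        total_words += words
        if text in already_translated:
            reused_words += words
    if total_words == 0:
        return 0.0
    return 100.0 * reused_words / total_words


project = ["Press the power button.", "Wait for the green light."]
done = {"Press the power button."}
savings = translation_savings(project, done)
```

In this example 4 of the 9 words are already translated, so roughly 44 percent of the project would not be sent out again.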
Integration with translation vendors can be used with this package for an automated content delivery back and forth from Vasont. The translation package is used to consolidate the status information for all your translation projects in one place so you can keep your projects on schedule and lower your costs.
Saturday, October 24, 2015
In my last post, I described the Teradata Unified Data Architecture™ product for big data. In today's post, I will describe Teradata partner Alteryx, which provides innovative technology that can help you get the maximum business value from your analytics using the Teradata Unified Data Architecture™.
Companies can extract the highest value from big data by combining all relevant data sources in their analysis. Alteryx makes it easy to create workflows that combine and blend data from relevant sources, bringing new and ad hoc sources of data into the Teradata Unified Data Architecture™ for rapid analysis. Analysts can collect data within this environment using connectors and SQL-H interfaces for optimal processing.
Create Business Analytics in an Easy-to-Use Workflow Environment
Using the design canvas and step-by-step, workflow-based environment of Alteryx, you can create analytics and analytic applications. With a single click, you can put those applications and answers to critical business questions in the hands of those who need them most. And when business conditions and underlying data change, Alteryx helps you iterate your analytic applications quickly and easily, without waiting for an IT organization or expensive statistical specialists.
Base Your Decisions on the Foresight of Accessible Predictive Analytics
Alteryx helps you make critical business decisions based on forward-looking, predictive analytics rather than past performance or simple guesswork. By embedding predictive analytics tools based on the R open source statistical language or any of the in-database analytic capabilities, Alteryx makes powerful statistical techniques accessible to everyone in your organization through a simple drag-and-drop interface.
Understand Where and Why Things Happen: Location Matters
Whether you are building a hyper-local marketing and merchandizing strategy or trying to understand the value of social media investments, location matters. Traditionally, this type of insight has been in the hands of a few geo specialists focused on mapping and trade areas. With Alteryx, you can put location specific intelligence in the hands of every decision maker.
With the rise of location-enabled devices such as smart phones and tablets, consumer and business interactions increasingly include a location data-point. This makes spatial analysis more critical than ever before. Alteryx provides powerful geospatial and location intelligence tools as part of any analytic workflow. You can visualize where events are taking place and make location-specific decisions.
Alteryx can push custom spatial queries into the Teradata Database to leverage its processing power and eliminate data movement. You can enrich your spatial data within the Teradata system using any or all of these functions provided by Alteryx:
- geocoding of data;
- drive-time analytics;
- trade area creation;
- spatial and demographic analysis;
- spatial and predictive analysis;
Alteryx simplifies the previously complex tasks of predictive and spatial analytics, so every employee in your organization can make critical business decisions based on real, verifiable facts.
Deliver the Right Data for the Right Question To The Right Person
To answer today’s complex business questions, you need to access your sources of insight in a single environment. That is why Alteryx allows you to bring together data from virtually any data source, whether structured, unstructured or cloud data, into an analytic application. Using Alteryx, you can extend the reach of business insight by publishing applications that let your business users run in-database analytics and get fast answers to their pressing business questions.
Teradata and Alteryx: Powerful Insights for Business Users
To exploit the opportunities of all their data, organizations need flexible data architectures as well as sophisticated analytic tools. Analysts need to rapidly gather, make sense of and derive insights from all the relevant data to make faster, more accurate strategic decisions. But given the variety of potential data sources, it is difficult for any single tool to be most effective at capturing, storing and exploring data. Using the Teradata Unified Data Architecture™ with Alteryx enables you to explore data from multiple sources, as well as the ability to deploy the insights derived from the data.
You can create sophisticated analytics, taking advantage of new, multi-structured data sources to deliver the most ROI. The combined solution:
- integrates and addresses both structured and emerging multi-structured data;
- leverages the Teradata Integrated Data Warehouse, Teradata Aster Discovery platform and Hadoop to optimal advantage;
- creates both in-database and cross-platform analytics quickly without requiring specialized SQL, MapReduce or R programming skills;
- lets you combine the capabilities of the Alteryx environment with routines developed in other analytical tools within a single analytical workflow;
- easily deploys analytics to the appropriate users beyond the analyst community.
The combined solution of Alteryx and the Teradata Unified Data Architecture™ provides an IT-friendly environment that supports the need to analyze data found inside and outside the data warehouse. Analysts and business users can leverage powerful engines to create and execute integrated applications. This kind of analysis is only possible with an environment that can bring together routines created by separate tools and running on different platforms.
Enhancing the Teradata Unified Data Architecture™ with the speed and agility of Alteryx creates a powerful environment for traditional and self-service analytics using integrated data and massively parallel processing platforms. It delivers:
- a complete solution for the full life-cycle of strategic and big data analytics, from transforming, enriching and loading data to designing analytic workflows and putting easy-to-use analytic applications in the hands of business users;
- improved ability to manage and extract value from structured and multi-structured data;
- ability for business analysts to create data labs and perform predictive and spatial analytics on the Teradata data warehouse and Teradata Aster discovery platforms;
- faster analytical processing within applications using in-database analytics in Teradata and SQL MapReduce functions in Teradata Aster.
The Alteryx solution helps customers with the Teradata Unified Data Architecture™ achieve these benefits by providing the following:
- robust set of analytical functions;
- access to a rich catalog of horizontal and industry-specific analytic applications in the Alteryx Analytics Gallery;
- syndicated household, demographic, firmographic, map and Census data to enrich existing sources;
- native data integration and in-database analytical support for Teradata data warehouse and Teradata Aster capabilities;
- ability to leverage Teradata SQL-H™ for accessing Hadoop data from Aster or Teradata Database platforms.
Use Case: Predicting and Preventing Customer Churn
A global communication service provider wants to prevent customer churn by identifying at-risk customers and making special offers that profitably reduce the likelihood of their leaving. To do this, they need predictive analytics.
Teradata and Alteryx deliver an end-to-end analytic workflow process from data consumption and analysis to application deployment. Alteryx integrates and loads call detail records from diverse sources along with customer data from the Teradata warehouse into the Aster database to create a complete, rich data set for iterative analysis. You can run iterative discovery analysis to determine the key indicators behind customer churn and loyalty. These key indicators are captured as repeatable applications to enrich the data warehouse with churn and loyalty scores. In addition, the discovery analysis is captured and deployed to the business users as a parameterized application for further iterative analysis.
Key Solution Components
- Aster Discovery platform for deep analytics and segmentation;
- Teradata data warehouse to operate and deploy insights and enriched data across the enterprise;
- Alteryx for the user workflow engine to orchestrate data blending and analytics.
- Ability to identify key customers that are likely churn candidates;
- determine problem spots on the network (cell sites, network elements) that are driving churn;
- discover other key reasons for churn (performance, competitive offers);
- discover which offers have prevented churn for similar customers in the past;
- identify which offers will work and evaluate a least-cost offer to prevent churn;
- ability to make offers to keep customers from leaving;
- deeper understanding of customer behavior.
Monday, October 12, 2015
Successful companies know that analytics is the key to winning customer loyalty, optimizing business processes and beating their competitors.
By integrating data from multiple parts of the organization to enable cross-functional analysis and a 360-degree view of the customer, businesses can make the best possible decisions. With more data and more sophisticated analytics, you can realize even greater business value.
Today businesses can tap new sources of data for business analytics, including web, social, audio/video, text, sensor data and machine-generated data. But with these new opportunities come new challenges.
For example, structured data (from databases) fits easily into a relational database model with SQL-based analytics. Other semi-structured or unstructured data may require non-SQL analytics, which are difficult for business users and analysts who require SQL access and
Another challenge is identifying the nuggets of valuable data from among and between multiple data sources. Analysts need to run iterations of analysis quickly against differing data sets, using familiar tools and languages. Data discovery can be especially challenging if data is stored on multiple systems employing different technologies.
Finally, there is the challenge of simply handling all the data. New data sources often generate data at extremely high frequencies and volumes. Organizations need to capture, refine and store the data long enough to determine which data to keep, all at an affordable price.
To exploit the competitive opportunities buried in data from diverse sources, you need a strong analytic foundation capable of handling large volumes of data efficiently. Specifically, you need to address the following three capabilities:
Data Warehousing - integrated and shared data environments for managing the business and delivering strategic and operational analytics to the extended organization.
Data Discovery - discovery analytics to rapidly explore and unlock insights from big data using a variety of analytic techniques accessible to mainstream business analysts.
Data Staging - a platform for loading, storing and refining data in preparation for analytics.
The Teradata Unified Data Architecture™ product includes a Teradata data warehouse platform and the Teradata Aster discovery platform for analytics, as well as open-source Apache Hadoop for data management and storage as needed.
The Teradata Active Enterprise Data Warehouse is the foundation of the integrated data warehouse solution. This appliance works well for smaller data warehouses or application-specific data marts.
For data discovery, the Teradata platform uses patented SQL-MapReduce® on the Aster Big Analytics Appliance, providing pre-packaged analytics and applications for data-driven discovery. Mainstream business users can easily access this insight using familiar SQL-based interfaces and leading business intelligence (BI) tools. If you are performing discovery on structured data, a partitioned data lab in the data warehouse is the recommended solution.
Hadoop is an effective, low-cost technology for loading, storing and refining data within the unified architecture. However, Hadoop is not designed as an analytic platform.
The Teradata Data Warehouse Appliance and the Teradata Extreme Data Appliance offer cost-effective storage and analytics for structured data. The Teradata Unified Data Architecture™ integrates these components into a cohesive, integrated data platform that delivers the following capabilities:
- unified management of both structured and unstructured data at optimal cost;
- powerful analytics spanning SQL and MapReduce analytics;
- seamless integration with the existing data warehouse environment and user skillset.
The Teradata Unified Data Architecture™ handles all types of data and diverse analytics for both business and technical users while providing an engineered, integrated and fully supported solution.
Wednesday, September 30, 2015
conceptClassifier for SharePoint is the enterprise automatic semantic metadata generation and taxonomy management solution. It is based on an open architecture with all APIs based on XML and Web Services. conceptClassifier for SharePoint supports all versions of SharePoint, SharePoint Online, Office 365, and OneDrive for Business.
Incorporating industry recognized Smart Content Framework™ and intelligent metadata enabled solutions, conceptClassifier for SharePoint provides a complete solution to manage unstructured and semi-structured data regardless of where it resides.
Utilizing unique compound term processing technology, conceptClassifier for SharePoint natively integrates with SharePoint and solves a variety of business challenges through concept identification capabilities.
- Tag content across the enterprise with conceptual metadata leveraging valuable legacy data.
- Classify consistent meaningful conceptual metadata to enterprise content, preventing incorrect meta tagging.
- Migrate tagged and classified content intelligently to locations both within and outside of SharePoint.
- Retrieve precise information from across the enterprise when and how it is needed.
- Protect sensitive information from exposure with intelligent tagging.
- Preserve information in accordance with records guidelines by identifying documents of record and eliminating inconsistent end user tagging.
Both automated and manual classification are supported to one or more term sets within the Term Store and across content hubs.
This is an advanced enterprise class, easy-to-use taxonomy and term set development and management tool. It integrates natively with the SharePoint Term Store reading and writing in real-time ensuring that the taxonomy/term set definition is maintained in only one place, the SharePoint Term Store. Designed for use by Subject Matter Experts, the Term Store and/or taxonomy is easily developed, tested, and refined.
Term Set Migration tools are also a component of conceptTaxonomyManager that enable term sets to be developed on one server (e.g. on-premise server) and then migrated to another server (e.g. Office 365 server) in an incremental fashion and preserving all GUIDs. This is a key requirement in migration.
conceptSearch Compound Term Processing Engine
Licensed for the sole use of building and refining the taxonomy/term set, the engine provides automatic semantic metadata generation that extracts multi-word terms or concepts along with keywords and acronyms. conceptSearch is an enterprise search engine and is sold as a separate product.
SharePoint Feature Set
Provides SharePoint integration and an additional multi-value pick-list browse taxonomy control enabling users to combine free text and taxonomy browse searching.
These are base platform and optional products that are needed to solve your particular business process challenge and leverage your SharePoint investment.
Search Engine Integration
This functionality is provided via conceptClassifier for SharePoint to integrate with any Microsoft search engine being used within SharePoint. conceptClassifier for SharePoint also supports integration with most non-SharePoint search engines and can perform on the fly classification with search engines calling the classify API.
Search engine support includes SharePoint, the former FAST products, Solr, Google Search Appliance, Autonomy, and IBM Vivisimo. If the FAST Pipeline Stage is required, this is sold as a separate product.
Intelligent Document Classification
This functionality is provided via conceptClassifier for SharePoint, to classify documents based upon concepts and multi-word terms that form a concept. Automatic and/or manual classification is included.
Content managers with the appropriate security can also classify content in real time. Content can be classified not only from within SharePoint but also from diverse repositories including File Shares, Exchange Public Folders, and websites. All content can be classified on the fly and classified to one or more taxonomies.
Taxonomy Management and Term Store Integration
With the Term Store functionality in SharePoint, organizations can develop a metadata model using out-of-the-box SharePoint capabilities. conceptClassifier for SharePoint provides native integration with the term store and the Managed Metadata Service application, where changes in the term store will be automatically available in the taxonomy component, and any changes in the taxonomy component will be immediately available in the term store.
A compelling advantage is the ability to consistently apply semantic metadata to content and auto-classify it to the Term Store metadata model. This solves the challenges of applying the metadata to a large number of documents and eliminates the need for end users to correctly tag content. Utilizing the taxonomy component, the taxonomies can be tested, validated, and managed, which is not a function provided by SharePoint.
Using conceptClassifier for SharePoint, an intelligent approach to migration can be achieved. As content is migrated, it is analyzed for organizationally defined descriptors and vocabularies, which will automatically classify the content to taxonomies, or optionally the SharePoint Term Store, and automatically apply organizationally defined workflows to process the content to the appropriate repository for review and disposition.
Intelligent Records Management
The ability to intelligently identify, tag, and route documents of record to either a staging library and/or a records management solution is a key component to driving and managing an effective information governance strategy. Taxonomy management, automatic declaration of documents of record, auto-classification, and semantic metadata generation are provided via conceptClassifier for SharePoint and conceptTaxonomyWorkflow.
Fully customizable to identify unique or industry standard descriptors, content is automatically meta-tagged and classified to the appropriate node(s) in the taxonomy based upon the presence of the descriptors, phrases, or keywords from within the content.
Once tagged and classified the content can be managed in accordance with regulatory or government guidelines. The identification of potential information security exposures includes the proactive identification and protection of unknown privacy exposures before they occur, as well as monitoring in real time organizationally defined vocabulary and descriptors in content as it is created or ingested. Taxonomy, classification, and metadata generation are provided via conceptClassifier for SharePoint.
eDiscovery, Litigation Support, and FOIA Requests
Taxonomy, classification, and metadata generation are provided via conceptClassifier for SharePoint. This is highly useful when relevance, identification of related concepts, and vocabulary normalization are required to reduce time and improve the quality of search results.
Taxonomy, classification, and metadata generation are provided via the conceptClassifier for SharePoint. A third party business intelligence or reporting tool is required to view the data in the desired format. This is useful to cleanse the data sources before using text analytics to remove content noise, irrelevant content, and identify any unknown privacy exposures or records that were never processed.
Taxonomy, classification, and metadata generation are provided via conceptClassifier for SharePoint. Integration with social networking tools can be accomplished if the tools are available in .NET or via SharePoint functionality. This is useful to provide structure to social networking applications and provide significantly more granularity in relevant information being retrieved.
Business Process Workflow
conceptTaxonomyWorkflow serves as a strategic tool managing migration activities and content type application across multiple SharePoint and non-SharePoint farms and is platform agnostic. This add-on component delivers value specifically in migration, data privacy, and records management, or in any application or business process that requires workflow capabilities.
conceptTaxonomyWorkflow is required to apply an action to a document and, optionally, to automatically apply a content type and route the document to the appropriate repository for disposition.
An additional add-on product, conceptContentTypeUpdater, is deployed at the site collection level and can be used by site administrators. It changes the SharePoint content type based on results from pre-defined workflows and is used only in the SharePoint environment.
Where does conceptClassifier for SharePoint fill the gaps?
- SharePoint has no ability to automatically create and store classification metadata.
- SharePoint has no taxonomy management tools to manage, test, and validate taxonomies based on the Term Store.
- SharePoint has no auto-classification capabilities.
- SharePoint has no ability to generate semantic metadata and surface it to search engines to improve search results.
- SharePoint has no ability to automatically tag content with vocabulary or retention codes for records management.
- SharePoint has no ability to automatically update the content type for records management or privacy protection and route to the appropriate repository.
- SharePoint has no ability to provide intelligent migration capabilities based on the semantic metadata within content, identify previously undeclared documents of record, unidentified privacy exposures, or information that should be archived or deleted.
- SharePoint has no ability to provide granular and structured identification of people, content recommendations, and organizational knowledge assets.
Leveraging Your SharePoint Investment
When evaluating a technology purchase and the on-going investment required to deploy, customize, and maintain, the costs can scale quickly. Because conceptClassifier for SharePoint is an enterprise infrastructure component, you can leverage your investment through:
- Native real-time read/write with the term store.
- Ability to implement workflow and automatic content type updating.
- Reduce IT Staff requirements to support diverse applications.
- Reduce costs associated with the purchase of multiple, stand-alone applications
- Deploy once, utilize multiple times.
- Rapidly integrated with any SharePoint or any .Net application.
- Used by Subject Matter Experts, not IT staff, does not require outside resources to manage and maintain.
- Eliminate unproductive and manual end user tagging and the support required by business units and IT.
- Reduce hardware expansion costs due to scalability and performance features.
- Deployable as an on-premise, cloud, or hybrid solution.
Leveraging Your Business Investment
The real value of your investment includes both technology and the demonstrable ROI that can be generated from improving business processes. conceptClassifier for SharePoint has been deployed to solve individual or multiple challenges including:
- Enables concept based searching regardless of search engine.
- Reduces organizational costs associated with data exposures, remediation, litigation, fines and sanctions.
- Eliminates manual metadata tagging and human inconsistencies that prohibit accurate metadata generation.
- Prevents the portability and electronic transmission of secured assets.
- Assists in the migration of content by identifying records as well as content that should have been archived, contains sensitive information, or should be deleted.
- Protects record integrity throughout the individual document lifecycle.
- Creates virtual centralization through the ability to link disparate on-premise and off-premise content repositories.
- Ensures compliance with industry and government mandates enabling rapid implementation to address regulatory changes.
The combination of the Smart Content Framework™, conceptClassifier for SharePoint, and the deployment of intelligent metadata enabled solutions results in a comprehensive and complete approach to SharePoint enterprise metadata management. Specific benefits are:
- Eliminate manual tagging.
- Improve enterprise search.
- Facilitate records management.
- Detect and automatically secure unknown privacy exposures.
- Intelligently migrate content.
- Enhance eDiscovery, litigation support, and FOIA requests.
- Enable text analytics.
- Provide structure to Enterprise 2.0.
Friday, July 31, 2015
In my last post, I described one of the most used applications of Dublin Core Metadata, RDF. In today's post, I will describe the second most used application of Dublin Core Metadata, the Web Ontology Language (OWL).
The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. An ontology is a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for various domains: the nouns represent classes of objects and the verbs represent relations between the objects.
An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share domain information (a domain is just a specific subject area or area of knowledge, like medicine, tool manufacturing, real estate, automobile repair, financial management, etc.). Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them. They encode knowledge in a domain and also knowledge that spans domains. In this way, they make that knowledge reusable.
Ontologies resemble class hierarchies. They are meant to represent information on the Internet and are expected to evolve almost constantly. Ontologies are typically very flexible because they draw on all sorts of data sources.
The OWL languages are characterized by formal semantics. They are built upon a W3C XML standard for RDF objects. I described RDF in my previous post.
The data described by an ontology in the OWL family is interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other. An ontology consists of a set of axioms which place constraints on sets of individuals (called "classes") and the types of relationships permitted between them. These axioms provide semantics by allowing systems to infer additional information based on the data explicitly provided.
OWL ontologies can import other ontologies, adding information from the imported ontology to the current ontology.
For example: an ontology describing families might include axioms stating that a "hasMother" property is only present between two individuals when "hasParent" is also present, and individuals of class "HasTypeOBlood" are never related via "hasParent" to members of "HasTypeABBlood" class. If it is stated that the individual Harriet is related via "hasMother" to the individual Sue, and that Harriet is a member of the "HasTypeOBlood" class, then it can be inferred that Sue is not a member of "HasTypeABBlood".
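The inference in this example can be traced in a few lines of plain Python. This is only a minimal sketch, not an OWL reasoner: the two helper functions are hypothetical names introduced here, and each one hard-codes a single axiom from the example.

```python
# Asserted facts from the example, as (subject, property, object) triples.
facts = {("Harriet", "hasMother", "Sue")}
# Asserted class membership.
members = {"HasTypeOBlood": {"Harriet"}, "HasTypeABBlood": set()}


def apply_subproperty(facts, sub, sup):
    """Axiom 1: wherever `sub` holds between two individuals, `sup` holds too."""
    return facts | {(s, sup, o) for (s, p, o) in facts if p == sub}


def cannot_be_in_other_class(facts, members, prop, subject_class):
    """Axiom 2: members of `subject_class` are never related via `prop` to
    members of the other class, so every `prop` target of such a member
    is known to be outside that other class."""
    return {o for (s, p, o) in facts
            if p == prop and s in members[subject_class]}


# Step 1: hasMother implies hasParent.
inferred = apply_subproperty(facts, "hasMother", "hasParent")
# Step 2: Harriet has type O blood, so her parents are not in HasTypeABBlood.
not_type_ab = cannot_be_in_other_class(inferred, members, "hasParent", "HasTypeOBlood")
print(sorted(not_type_ab))  # ['Sue']
```

A real reasoner derives such conclusions from the axioms themselves rather than from hand-written functions; the sketch only makes the two reasoning steps explicit.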
The W3C-endorsed OWL specification includes the definition of three variants of OWL, with different levels of expressiveness. These are OWL Lite, OWL DL and OWL Full.
OWL Lite was originally intended to support those users primarily needing a classification hierarchy and simple constraints. It is not widely used.
OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, number restrictions may not be placed upon properties which are declared to be transitive). OWL DL is so named due to its correspondence with description logic, a field of research that has studied the logics that form the formal foundation of OWL.
OWL Full is based on a different semantics from OWL Lite or OWL DL, and was designed to preserve some compatibility with RDF Schema. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right; this is not permitted in OWL DL. OWL Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary.
OWL Full is intended to be compatible with RDF Schema (RDFS), and to be capable of augmenting the meanings of existing Resource Description Framework (RDF) vocabulary. This interpretation provides the meaning of RDF and RDFS vocabulary. So, the meaning of OWL Full ontologies is defined by extension of the RDFS meaning, and OWL Full is a semantic extension of RDF.
Languages in the OWL family can define classes, properties and instances, as well as operations on them.
An instance is an object. It corresponds to a description logic individual.
A class is a collection of objects. It corresponds to a description logic (DL) concept. A class may contain individuals, instances of the class. A class may have any number of instances. An instance may belong to none, one or more classes. A class may be a subclass of another, inheriting characteristics from its parent superclass.
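As a rough illustration of classes, instances and subclass inheritance, the sketch below uses plain Python dictionaries. The class names and individuals are hypothetical, and the code models only asserted membership plus inherited superclasses, not OWL semantics in full.

```python
# Hypothetical hierarchy: each class maps to its direct superclasses.
subclass_of = {"Woman": {"Human"}, "Human": {"Animal"}}
# Hypothetical individuals: an instance may belong to none, one or more classes.
asserted_types = {"Elizabeth": {"Woman"}, "Rex": set()}


def superclasses(cls):
    """Return every class reachable by following subclass links upward."""
    found = set()
    stack = [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def classes_of(individual):
    """Asserted classes of an individual plus everything they inherit from."""
    direct = asserted_types.get(individual, set())
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls)
    return result


print(sorted(classes_of("Elizabeth")))  # ['Animal', 'Human', 'Woman']
print(sorted(classes_of("Rex")))        # []
```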
Classes and their members can be defined in OWL either by extension or by intension. An individual can be explicitly assigned to a class by a class assertion (for example, the statement that Queen Elizabeth is an instance of human), or implicitly by a class expression (for example, every instance of the human class that has the value female for a gender property is an instance of the woman class).
A property is a directed binary relation that specifies class characteristics. It corresponds to a description logic role. They are attributes of instances and sometimes act as data values or link to other instances. Properties may possess logical capabilities such as being transitive, symmetric, inverse and functional. Properties may also have domains and ranges.
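Two of the logical characteristics mentioned above, symmetry and transitivity, can be sketched as closures over a set of pairs. The property names and individuals below are hypothetical, and the functions are a simple illustration rather than how any particular OWL reasoner works.

```python
def symmetric_closure(pairs):
    """If (a, b) holds for a symmetric property, (b, a) holds as well."""
    return set(pairs) | {(b, a) for (a, b) in pairs}


def transitive_closure(pairs):
    """If (a, b) and (b, c) hold for a transitive property, so does (a, c)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:  # nothing left to add
            return closure
        closure |= new


# Hypothetical assertions for a transitive and a symmetric property.
ancestor_of = {("Ann", "Bob"), ("Bob", "Cid")}
sibling_of = {("Dee", "Eve")}

print(sorted(transitive_closure(ancestor_of)))
# [('Ann', 'Bob'), ('Ann', 'Cid'), ('Bob', 'Cid')]
print(sorted(symmetric_closure(sibling_of)))
# [('Dee', 'Eve'), ('Eve', 'Dee')]
```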
Datatype properties are relations between instances of classes and RDF literals or XML schema datatypes. For example, modelName (String datatype) is a property of the Manufacturer class. They are formulated using the owl:DatatypeProperty type.
Object properties are relations between instances of two classes. For example, ownedBy may be an object type property of the Vehicle class and may have a range which is the class Person. They are formulated using owl:ObjectProperty.
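The difference between the two kinds of property can be sketched as follows. In OWL, a domain or range is generally used to infer class membership rather than to reject data, so the sketch infers types from each assertion. The individuals (car1, alice, acme) and the dictionary layout are assumptions made for illustration; the literal type check is a simplification of datatype handling.

```python
# Property declarations taken from the examples in the text.
object_properties = {"ownedBy": {"domain": "Vehicle", "range": "Person"}}
datatype_properties = {"modelName": {"domain": "Manufacturer", "range": str}}


def infer_types(assertions):
    """Infer class membership of individuals from property domains and ranges."""
    types = {}
    for subject, prop, value in assertions:
        if prop in object_properties:
            spec = object_properties[prop]
            # Both ends of an object property are individuals.
            types.setdefault(subject, set()).add(spec["domain"])
            types.setdefault(value, set()).add(spec["range"])
        elif prop in datatype_properties:
            spec = datatype_properties[prop]
            # The value of a datatype property is a literal, not an individual.
            if not isinstance(value, spec["range"]):
                raise TypeError(f"{prop} expects a {spec['range'].__name__} literal")
            types.setdefault(subject, set()).add(spec["domain"])
    return types


# Hypothetical assertions.
assertions = [("car1", "ownedBy", "alice"), ("acme", "modelName", "Roadster")]
print(infer_types(assertions))
```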
Languages in the OWL family support various operations on classes such as union, intersection and complement. They also allow class enumeration, cardinality, and disjointness.
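These class operations map naturally onto set operations. The sketch below uses Python sets over a small, fixed universe of hypothetical individuals. That is a closed-world simplification: OWL itself makes an open-world assumption, so a real reasoner would not compute a complement or count property values this directly.

```python
# Hypothetical individuals and class extensions.
universe = {"ann", "bob", "cid", "dee"}
student = {"ann", "bob"}
employee = {"bob", "cid"}
retired = {"dee"}

student_or_employee = student | employee   # union
working_student = student & employee       # intersection
not_student = universe - student           # complement (relative to the universe)
weekend = frozenset({"Saturday", "Sunday"})  # enumeration: a class listed member by member
students_and_retired_disjoint = student.isdisjoint(retired)  # disjointness


def at_least(pairs, n):
    """Cardinality restriction: individuals with at least n distinct values."""
    values = {}
    for subject, value in pairs:
        values.setdefault(subject, set()).add(value)
    return {s for s, vs in values.items() if len(vs) >= n}


# Hypothetical hasChild assertions.
has_child = {("cid", "eve"), ("cid", "fay"), ("dee", "gus")}
print(sorted(at_least(has_child, 2)))  # ['cid']
```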
Metaclasses are classes of classes. They are allowed in OWL Full or with a feature called class/instance punning.
The OWL family of languages supports a variety of syntaxes. It is useful to distinguish high level syntaxes aimed at specification from exchange syntaxes more suitable for general use.
These are close to the ontology structure of languages in the OWL family.
OWL Abstract Syntax
This high level syntax is used to specify the OWL ontology structure and semantics.
The OWL abstract syntax presents an ontology as a sequence of annotations, axioms and facts. Annotations carry machine and human oriented metadata. Information about the classes, properties and individuals that compose the ontology is contained in axioms and facts only. Each class, property and individual is either anonymous or identified by a URI reference. Facts state data either about an individual or about a pair of individual identifiers (that the objects identified are distinct or the same). Axioms specify the characteristics of classes and properties.
OWL2 Functional Syntax
This syntax closely follows the structure of an OWL2 ontology. It is used by OWL2 to specify semantics, mappings to exchange syntaxes and profiles.
OWL2 XML Syntax
OWL2 specifies an XML serialization that closely models the structure of an OWL2 ontology.
The Manchester Syntax is a compact, human readable syntax with a style close to frame languages. Variations are available for OWL and OWL2. Not all OWL and OWL2 ontologies can be expressed in this syntax.
OWL is playing an important role in an increasing number and range of applications, and is the focus of research into tools, reasoning techniques, formal foundations and language extensions.
Wednesday, July 8, 2015
The Dublin Core Schema is a small set of vocabulary terms that can be used to describe different resources.
Dublin Core Metadata may be used for multiple purposes, from simple resource description, to combining metadata vocabularies of different metadata standards, to providing inter-operability for metadata vocabularies in the Linked data cloud and Semantic web implementations.
Dublin Core Metadata is most commonly used with RDF and OWL. I will describe OWL in my next post.
RDF stands for Resource Description Framework. It is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.
RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.
This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
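As a rough illustration of this graph view, the sketch below stores a few triples as plain Python tuples and groups the outgoing edges by subject. The `ex:` resources and the `outgoing_edges` helper are invented for this example; a real application would normally use an RDF library rather than bare tuples.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object); the names are illustrative only.
triples = [
    ("ex:John", "rdf:type", "foaf:Person"),
    ("ex:John", "foaf:knows", "ex:Mary"),
    ("ex:Mary", "rdf:type", "foaf:Person"),
]


def outgoing_edges(triples):
    """Group triples into a directed, labeled graph: node -> [(edge label, target node)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = outgoing_edges(triples)
for node, edges in graph.items():
    for label, target in edges:
        print(f"{node} --{label}--> {target}")
```

Each printed line is one edge of the graph: the subject and object are the nodes, and the predicate is the label on the edge between them.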
RDF Schema, or RDFS, is a set of classes with certain properties using the RDF extensible knowledge representation data model. It provides basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources. These resources can be saved in a triplestore and retrieved with the query language SPARQL.
The first version of RDFS was published by the World-Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).
Main RDFS constructs
RDFS constructs are the RDFS classes, associated properties, and utility properties built on the limited vocabulary of RDF.
Resource is the class of everything. All things described by RDF are resources.
Class declares a resource as a class for other resources.
A typical example of a Class is "Person" in the Friend of a Friend (FOAF) vocabulary. An instance of "Person" is a resource that is linked to the class "Person" using the type property, such as in the following formal expression of the natural language sentence: "John is a Person".
example: John rdf:type foaf:Person
The other classes described by the RDF and RDFS specifications are:
- Literal – literal values such as strings and integers. Property values such as textual strings are examples of literals. Literals may be plain or typed.
- Datatype – the class of datatypes. Datatype is both an instance of and a subclass of Class. Each instance of Datatype is a subclass of Literal.
- XMLLiteral – the class of XML literal values. XMLLiteral is an instance of Datatype (and thus a subclass of Literal).
- Property – the class of properties.
Properties are instances of the class Property and describe a relation between subject resources and object resources.
For example, the following declarations are used to express that the property "employer" relates a subject, which is of type "Person", to an object, which is of type "Organization":
ex:employer rdfs:domain foaf:Person
ex:employer rdfs:range foaf:Organization
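One way to read these two declarations is as inference rules: whenever `ex:employer` links two resources, the subject can be typed as a Person and the object as an Organization. The sketch below applies that reading once over a list of tuples. The `infer_types` helper and the `ex:John` / `ex:Acme` resources are hypothetical, and for brevity it assumes each property declares at most one domain and one range.

```python
def infer_types(triples):
    """Apply the rdfs:domain and rdfs:range rules once over the given triples."""
    # Simplifying assumption: at most one domain and one range per property.
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))  # the subject belongs to the domain class
        if p in ranges:
            inferred.add((o, "rdf:type", ranges[p]))  # the object belongs to the range class
    return inferred


triples = [
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
    ("ex:John", "ex:employer", "ex:Acme"),  # hypothetical data
]
inferred = infer_types(triples)
for triple in sorted(inferred):
    print(triple)
```

Note that domain and range do not reject data that fails to match; they let a reasoner conclude additional `rdf:type` statements about the resources involved.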
Hierarchies of classes support inheritance of a property domain and range from a class to its sub-classes.
RDFS also defines the following properties:
- subPropertyOf is an instance of Property that is used to state that all resources related by one property are also related by another.
- Label is an instance of Property that may be used to provide a human-readable version of a resource's name.
- Comment is an instance of Property that may be used to provide a human-readable description of a resource.
- seeAlso is an instance of Property that is used to indicate a resource that might provide additional information about the subject resource.
- isDefinedBy is an instance of Property that is used to indicate a resource defining the subject resource. This property may be used to indicate an RDF vocabulary in which a resource is described.
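To make the subPropertyOf rule above more concrete, here is a minimal sketch that follows subPropertyOf links transitively and derives the extra statements they imply. The helper names and the `ex:hasMother`, `ex:hasParent` and `ex:hasAncestor` properties are made up for illustration, and the code works on plain tuples rather than a real triplestore.

```python
def super_properties(prop, triples):
    """Return every property reachable from prop via rdfs:subPropertyOf links."""
    parents = {}
    for s, p, o in triples:
        if p == "rdfs:subPropertyOf":
            parents.setdefault(s, set()).add(o)
    found, stack = set(), [prop]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def expand_sub_properties(triples):
    """Derive (s, super, o) for each (s, p, o) where p is a sub-property of super."""
    extra = set()
    for s, p, o in triples:
        for parent in super_properties(p, triples):
            extra.add((s, parent, o))
    return extra


triples = [
    ("ex:hasMother", "rdfs:subPropertyOf", "ex:hasParent"),
    ("ex:hasParent", "rdfs:subPropertyOf", "ex:hasAncestor"),
    ("ex:John", "ex:hasMother", "ex:Mary"),  # hypothetical data
]
extra = expand_sub_properties(triples)
for triple in sorted(extra):
    print(triple)
```

Because the links are followed transitively, the single `ex:hasMother` statement yields both an `ex:hasParent` and an `ex:hasAncestor` statement about the same pair of resources.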